Thursday, March 28, 2024

How to identify the point which is to be skewed for tile to tile interface from timing reports

 Need for skewing. Skewing needs to be done in the following cases 

  • High skew increases the number of buffers added in the lowsvs corners 
  • Lowsvs setup is critical: each buffer added in lowsvs adds about 200ps, while the same buffer gives only 18ps or less in high vt corners. 
  • Memory paths which need to be balanced are critical and mostly need to be skewed 



How to identify the point which needs to be skewed. 

  • Most likely these would be mux outputs. 
  • Without design knowledge, we can identify them in a crude way by filtering the timing reports with the following simple command 
Steps 
  • The below command gives you the common pins which are repeated n times across the paths. 
  • Pick up the points from its output and check whether the skewing can be done at those points. 
  • Identify whether these points are outside block A and block B, so that they can be skewed at the top interface 
  • This reduces the analysis time. 
  • If you have design knowledge, you can directly identify the points which can be skewed
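
The filtering idea in the steps above can be sketched in Python. This is only an illustration and not the command used in the flow: the pin names are hypothetical, and it assumes the pins of each violating path have already been extracted from the timing report into a list.

```python
from collections import Counter

def common_pins(paths, min_count=2):
    """Return (pin, count) pairs for pins seen in at least min_count paths."""
    counts = Counter()
    for pins in paths:
        # Count each pin once per path, even if the report lists it twice.
        counts.update(set(pins))
    return [(pin, n) for pin, n in counts.most_common() if n >= min_count]

# Hypothetical pin lists, one per violating tile-to-tile path.
paths = [
    ["blockA/u_reg1/q", "top/u_mux/z", "blockB/u_reg7/d"],
    ["blockA/u_reg2/q", "top/u_mux/z", "blockB/u_reg8/d"],
    ["blockA/u_reg3/q", "top/u_mux/z", "blockB/u_reg9/d"],
]

print(common_pins(paths))  # [('top/u_mux/z', 3)]
```

A pin that shows up in many violating paths and sits outside block A and block B, like the hypothetical top/u_mux/z here, is a candidate skew point.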





Wednesday, March 27, 2024

debugging high latency during CTS

 During CTS we often see multiple clocks getting balanced. Sometimes we also find clocks which are not meant to be balanced getting balanced by the CTS engine. 

The CTS engine tries to balance the clocks based on all the clocks reaching the end point or sink pin.

Step1 : Get all the clocks reaching the sink pin on the high latency path. 

Step2: identify muxes in the clock path 

Step3:  check clocks on b pin 

Step4: check clocks on a pin 

Step5: check the case value on the mux sel pin

If no case_value is defined on the sel pin, then ideally two generated clocks should have been defined on the z output pin of the mux and declared logically exclusive to resolve the issue.



 Check all clocks on the sink pin 

pt_shell> get_attribute [get_pins  {module/u_cmux2_SIZE_ONLY/z} ] clocks

{"CLKM1_long_latency_clk", "CLKM1_main_clk"}


Report the clock path for the problematic clock using the command below and identify the mux in the path 

report_clock_timing  -to  {mux/reg[1]/clk} -type latency -clock  main_clock -verbose

Check the clocks on b pin of mux 

pt_shell> get_attribute [get_pins  {module/u_cmux2_SIZE_ONLY/b} ] clocks

{"CLKM1_main_clk"}

Check clocks on a pin of the mux 

pt_shell> get_attribute [get_pins  {module/u_cmux2_SIZE_ONLY/a} ] clocks

{"CLKM1_long_latency_clk"}

Check the case value on the mux sel pin (if the case value is not defined, then both clocks get propagated) 

pt_shell> get_attribute [get_pins  {module/u_cmux2_SIZE_ONLY/sel} ] case_value

==> no case value defined


Create the generated clocks at the mux output for the two clocks on the a pin and the b pin 

create_generated_clock -name CLKM1_main_clk_DIV1 \
  -source [get_ports {main_port}] \
  -divide_by 1 -add -master_clock [get_clocks {CLKM1_main_clk}] \
  [get_pins {module/u_cmux2_SIZE_ONLY/z}] 


create_generated_clock -name long_latency_clk_DIV1 \
  -source [get_ports {main_port}] \
  -divide_by 1 -add -master_clock [get_clocks {CLKM1_long_latency_clk}] \
  [get_pins {module/u_cmux2_SIZE_ONLY/z}] 


set_clock_groups -logically_exclusive -group [get_clocks CLKM1_main_clk_DIV1] -group [get_clocks long_latency_clk_DIV1] 

Issue fixed? Share your thoughts on how else we can fix this issue, other ways of debugging it, and the pros and cons of each fix. 

Wednesday, October 31, 2018

Advanced On Chip Variation


Environment Conditions. 
               Just as people across the globe do not all experience the same environment conditions, the transistors on a chip do not all experience the same environment conditions, and these conditions influence their functioning. 
Voltage and Temperature 
               These include variation in the voltage applied to the transistors and in the temperature at that portion of the chip. 

Width or shape of transistors. (Process) 

              Different people across the globe are built differently. People in Germany and Russia have a different average height from people in India.

In a similar way, transistors sit in different regions of the die. Due to process variation, the width of a transistor might be different in different areas of the chip.

        This causes the delay to vary across the different regions of the chip.

We also close timing across the maximum and minimum operating conditions in which the chip must function by taking these as setup and hold corners.

On chip Variation 

              To address this variation in transistor delay, we add margin by applying a blanket derate value on each cell delay.

Advanced On Chip Variation

         Applying a blanket derate value is more pessimistic than necessary for your design.
AOCV tries to address two types of variations

1. Random Variation.
                As the name says, random variations are those which cannot be predicted, such as the metal thickness in a region, the oxide thickness and the implant doses.

  •  The derate for random variation depends on the path depth, and it decreases as the depth increases. 
  •  For example, if we put a pessimism of 10ps derate on a cell for random variation, then 8 stages will have around 80ps of derate. 
  •  But 80ps is too pessimistic for the overall path, since the chance of every cell in the path suffering the full 10ps derate at the same time is low. 
  • Hence we decide the random variation derate based on the depth of the cell within the path and reduce it as the depth increases. 
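
The 10ps per cell and 8 stage example above can be worked through numerically. The sketch below assumes the random variations of the cells are independent, so that they add in a root-sum-square fashion. That is a common statistical argument and not necessarily how a given AOCV table was characterized; the function names are hypothetical.

```python
import math

def flat_derate_ps(per_cell_ps, depth):
    """Blanket OCV view: every cell is assumed to see the full derate at once."""
    return per_cell_ps * depth

def rss_derate_ps(per_cell_ps, depth):
    """Assumption: independent random variations add as root-sum-square."""
    return per_cell_ps * math.sqrt(depth)

for depth in (1, 2, 4, 8):
    flat = flat_derate_ps(10.0, depth)
    rss = rss_derate_ps(10.0, depth)
    # The effective per-cell derate shrinks as the path gets deeper.
    print(f"depth={depth}  flat={flat:.1f}ps  rss={rss:.1f}ps  per-cell={rss / depth:.2f}ps")
```

Under this assumption the 8 stage path carries roughly 28ps of random variation instead of 80ps, which is why the per-cell derate can be reduced for deeper paths.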



2. Systematic Variation
                   Systematic variations are based on distance. They include gate length, gate width, interconnect width etc.


  • The greater the physical distance between the devices on the die, the larger the derate applied on them. 
  • This is because devices in different regions of the die experience PVT differently, just as temperature differs across regions of the globe. 
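
Putting the two effects together, a derate can be looked up from a table indexed by path depth and distance. The sketch below is only an illustration: the breakpoints and derate values are invented for this example and do not come from any library, and it picks the nearest lower breakpoint for simplicity, whereas a real flow may interpolate.

```python
import bisect

# Hypothetical late-derate table: rows are depth breakpoints,
# columns are distance breakpoints in um. All values are made up.
DEPTHS = [1, 2, 4, 8]
DISTANCES_UM = [0, 500, 1000]
LATE_DERATE = [
    [1.12, 1.13, 1.15],  # depth 1
    [1.09, 1.10, 1.12],  # depth 2
    [1.07, 1.08, 1.10],  # depth 4
    [1.05, 1.06, 1.08],  # depth 8
]

def lookup_derate(depth, distance_um):
    """Pick the entry at the largest breakpoints not exceeding the inputs."""
    row = max(bisect.bisect_right(DEPTHS, depth) - 1, 0)
    col = max(bisect.bisect_right(DISTANCES_UM, distance_um) - 1, 0)
    return LATE_DERATE[row][col]

print(lookup_derate(1, 1000))  # shallow path over a long distance: largest derate
print(lookup_derate(8, 0))     # deep path over a short distance: smallest derate
```

The table shape follows the text: the derate falls as the depth increases and rises as the distance increases.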





             
              

Saturday, July 8, 2017

virtual clocks and their usage

Virtual Clocks
               By definition, a virtual clock is a clock which is not associated with any port or pin. It is not a real clock, but it mimics the functionality of a real clock. It is advantageous to use virtual clocks for optimization because they can be given different values of jitter and uncertainty.
               

Use of virtual clocks gives us the following advantages:


  1. Specify the clock latencies on the virtual clocks with different values compared to the real clocks in your design. 
  2. Gives the flexibility to check IO timing separately, as by default different path groups are created for different clocks. 
  3. Perform the budgeting by adjusting the IO delays and the clock latencies on them, and perform your optimization accordingly.
  4. It also gives you the flexibility to give different uncertainty and jitter values for your optimization and timing calculation purposes.


           Below is how your IO delays can be modeled with the help of virtual clocks.
1. Create the virtual clocks.
2. Model the input delays and output delays with respect to the virtual clocks.
3. Apply the clock latencies on the virtual clocks.


create_clock -name CLK1 -period 3.236 -waveform { 0 1.618 } [get_ports {O_CLK1}]
set_clock_transition 0.150 [get_clocks {CLK1}]
set_clock_uncertainty -setup 0.110 [get_clocks {CLK1}]
set_clock_latency 2.200 [get_clocks {CLK1}]

create_clock -name VIRT_CLK1 -period 3.236 -waveform { 0 1.618 }
set_clock_transition 0.150 [get_clocks {VIRT_CLK1}]
set_clock_uncertainty -setup 0.110 [get_clocks {VIRT_CLK1}]
set_clock_latency 2.200 [get_clocks {VIRT_CLK1}]

set_input_delay 1.6 -clock [get_clocks {VIRT_CLK1}] [get_ports {O_CLK}]

Friday, October 4, 2013

DRC LVS cleaning procedure

Cleaning DRC and LVS can be a harrowing experience if you do not know which target to address first and which to address next.
You can follow this order to make the best of your time


Temperature inversion~~cause and analysis

Let us first understand the basic concept of what causes the delay of the transistor from the below equation


fig 1. relation between current and delay of a mosfet 


  As you can see from the above, the more the current, the less the delay of the PFET or NFET. You can also refer to my previous blog on understanding the cell delay and transition.

Now you understand that cell delay depends on the current. Let us see what the current depends on


Relation between mobility and current.

A phenomenon called lattice scattering happens at higher temperatures. Lattice vibrations cause the mobility to decrease with increasing temperature. Hence the resistance of a mosfet increases with increase in temperature causing current to reduce. 
       



From the above figure we find that the current I is directly proportional to the mobility of the semiconductor. Hence as the mobility decreases, the current decreases and the cell delay increases.
So, as temperature increases ==> cell delay increases due to mobility
As temperature decreases ==> cell delay should decrease due to mobility

Now this looks quite satisfying, and this is normally the case without temperature inversion. Then what causes temperature inversion? It is the Vt change.


Temperature and Vt variation
                     
                             The threshold voltage of a mosfet decreases with increase in temperature as follows. Here (alpha Vt) is a constant which denotes the decrease in Vt, about -3 mV/degC.
                 
Now put this in again in mosfet equation it becomes 

Here we observe that the Vgs - Vt term is squared, which means that your current increases drastically as Vt drops. Hence the delay of your mosfet decreases drastically with the drop in Vt, as the current increases significantly. 

Now observe that delay of your mosfet  is basically dependent on two factors.
1.  mobility due to temperature variation.
2.  Vt varying with temperature

              The final drain current depends on which factor dominates at the given temperature. In practice, when you decrease the temperature, you may be surprised to find that you do not always observe temperature inversion. That is because there is one more factor of dependence, which is the supply voltage.

Cell delay and supply voltage

                    In the (Vgs - Vt) squared term of the drain current equation, you observe that
1. when Vgs is high, (Vgs - Vt) almost stays constant, hence your current is dependent on mobility
2. when Vgs is low, the (Vgs - Vt) difference dominates, hence your current is dependent on Vt
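
The two cases above can be tried out with a toy model. The sketch below assumes a simple square-law current, a mobility that scales as T^-1.5 (a common textbook approximation for lattice scattering) and a Vt that drops by 3 mV/degC. The function name, the nominal Vt of 0.4V and the two Vgs values are hypothetical, and the voltage at which the behaviour flips in this toy model is not representative of any real process.

```python
def drain_current(vgs, temp_c, vt0=0.4, t0_c=25.0, dvt_dt=-0.003, mobility_exp=-1.5):
    """Relative drain current: mobility(T) * (Vgs - Vt(T))^2, arbitrary units."""
    t_k = temp_c + 273.15
    t0_k = t0_c + 273.15
    mobility = (t_k / t0_k) ** mobility_exp   # falls as temperature rises
    vt = vt0 + dvt_dt * (temp_c - t0_c)       # falls as temperature rises
    overdrive = max(vgs - vt, 0.0)
    return mobility * overdrive ** 2

for vgs in (0.8, 2.5):
    cold = drain_current(vgs, -40.0)
    hot = drain_current(vgs, 125.0)
    # More current means less delay, so the corner with less current is slower.
    trend = "slower when cold (inversion)" if hot > cold else "slower when hot (no inversion)"
    print(f"Vgs={vgs}V  I(-40C)={cold:.3f}  I(125C)={hot:.3f}  {trend}")
```

At the low Vgs the Vt term wins and the cell is slower when cold, while at the high Vgs the mobility term wins and the cell is slower when hot.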

      With all this happening, all corners need to be addressed for the chip to function successfully. Hence multi mode multi corner analysis, which is robust enough to catch all discrepancies, is done before taping out the chips.



Thursday, June 27, 2013

Why hold does not depend on clock frequency ?

If there is a setup violation, the frequency of the chip can be reduced and the chip can still be made to function. But if there is a hold violation, then your chip is lost forever. This is the most used phrase in physical design. But did you ever try to analyse why hold violations do not depend on the frequency of the chip?

First of all let us understand Setup and hold checks completely


Imagine data is travelling from FF1 to FF2 as shown in the figure. 

Look at the timing diagram below 

  • Data1(clock cycle1 data of FF1 ) is being sampled at FF2 in clock cycle2
  • Data2 (clock cycle2 data of FF1) is on its way to FF2 already 




From the figure 

 Setup check
                     It says that the data launched from FF1 at cycle 1 should reach FF2 in cycle 2 before the FF2 setup time.

Equation 

 Tc2q (FF1) + Tcomb <= Tclk - Tsetup


Hold check
                   It says that the current data ==> Data2, which is launched from FF1 at cycle2, should not arrive at FF2 before the FF2 hold time in cycle2 has elapsed (because it would corrupt Data1, which is currently being captured by FF2 in cycle2).
                  In other words, Data2 from FF1 should not disturb Data1, which is already at FF2 and is currently being sampled. 
                     
Tc2q (clock to Q delay of FF1) + Tcomb >= Thold

Your clock->Q delay and Tcomb do not depend on the clock period at all. Hence your hold check is independent of clock frequency.
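
The two checks can be put side by side with a few numbers. The sketch below uses the setup and hold inequalities from this post with ideal clocks (no skew or uncertainty). The delay values are hypothetical and chosen so that the hold check fails.

```python
def setup_slack(t_clk, t_c2q, t_comb, t_setup):
    """Setup check: Tc2q + Tcomb <= Tclk - Tsetup. Positive slack means it passes."""
    return (t_clk - t_setup) - (t_c2q + t_comb)

def hold_slack(t_c2q, t_comb, t_hold):
    """Hold check: Tc2q + Tcomb >= Thold. Note that Tclk is not an input at all."""
    return (t_c2q + t_comb) - t_hold

# Hypothetical delays in ns.
T_C2Q, T_COMB, T_SETUP, T_HOLD = 0.10, 0.02, 0.05, 0.15

for t_clk in (0.15, 1.0, 10.0):
    s = setup_slack(t_clk, T_C2Q, T_COMB, T_SETUP)
    h = hold_slack(T_C2Q, T_COMB, T_HOLD)
    print(f"Tclk={t_clk}ns  setup slack={s:+.2f}ns  hold slack={h:+.2f}ns")
```

Slowing the clock turns the setup slack positive, but the hold slack stays negative at every period, since Tclk never enters the hold equation.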