Si2oaD/Regular Expressions
From oacwiki
In addition to the various attribute filter options, regular expressions can be used to limit output to a subset of matched Objects (and their surrounding context). Considerable familiarity with OpenAccess “owner” relationships is necessary to use Si2oaD regular expression filters effectively. The OA Information Model Schema Diagrams and Si2 Course UML models are useful references to aid visualization of these relationships.
1 Summary
To control matches for output filtering, the oad() function is called with the “x” command along with a single pathExpression composed of one or more componentExpressions, which are in turn composed of a typeRegex and an optional nameRegex in the following form:
typeRegex[=nameRegex] [typeRegex[=nameRegex] ...]
As shown above, a componentExpression consists of
- one typeRegex designed to match the name of an OA Object Type
These Type names are the names of OA classes, but without the "oa" prefix. Every typeRegex is matched case-insensitively. Consequently, the pathExpressions "net" and "Net" both match all oa*Net* Types, which include, for example, oaScalarNet as well as oaBusNetDef.
- an optional nameRegex that is designed to match names of Objects of the Type(s) matched by the typeRegex
With no nameRegex, all names for the Type are matched (equivalent to .*). For Objects with no name, such as Blocks, a nameRegex should not be entered. Case-sensitivity of the nameRegex matching is set by the casesensitive option in the options file (5.2.2), and may be overridden by a prefix to the pathExpression (6.3.3).
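The matching rule for a single componentExpression can be sketched in Python as follows. This is an illustration only, not Si2oaD code: the function name component_matches is hypothetical, the treatment of unnamed Objects is an assumption, and Python's re module stands in for the POSIX extended regex engine (the two agree on the simple patterns used here).

```python
import re

def component_matches(type_regex, name_regex, obj_type, obj_name,
                      case_sensitive=True):
    """Return True if an Object passes one componentExpression.

    obj_type is the OA class name without the "oa" prefix, e.g. "ScalarNet".
    obj_name is None for unnamed Objects such as Blocks.
    """
    # Type names are always compared ignoring case.
    if re.search(type_regex, obj_type, re.IGNORECASE) is None:
        return False
    # No nameRegex: every name (or no name at all) is accepted.
    if name_regex is None:
        return True
    # Assumption: an unnamed Object cannot satisfy a nameRegex.
    if obj_name is None:
        return False
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.search(name_regex, obj_name, flags) is not None

print(component_matches("net", None, "ScalarNet", "clk"))   # True
print(component_matches("Net", None, "BusNetDef", "data"))  # True
```

The Object names "clk" and "data" are made up for the example; only the Type names come from the text above.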
2 Syntax Details
After a pathExpression is specified (set with the oad() “x” command), the next oad() call to display data will use regex matching:
call oad( "x", "pathExpression" )
call oad(rootObject)
Regex matching is enabled and disabled independently of defining and changing the pathExpression itself. The regexsticky option can be used to force all subsequent oad() calls to use regex matching; otherwise, only the call immediately following the oad("x","...") command that set the expression will use it for matching.
POSIX extended ("modern") regex syntax is used for all regular expressions.
2.1 Delimiters
Delimiters are used to separate each typeRegex from the preceding one, and each nameRegex, from its typeRegex. These delimiters are set independently using the following options (6.3):
- xdelimtype: Two or more must precede each successive Object typeRegex (The default is a blank)
- xdelimname: One must separate a typeRegex from its nameRegex (The default is an equal sign)
These delimiters can be changed independently, as needed, for any pathExpression.
- If a nameRegex must use two consecutive blanks (the default xdelimtype) to match desired Objects in the data, the xdelimtype must be changed to a different character, one that will not be needed for the regex match in question. The delimiters can be changed as often as necessary in a session, though such changes must be made prior to executing the oad("x","...") command that needs them. Of course, the xdelimtype may not be a letter or number that might appear in an oaType name.
- Similarly, if a nameRegex contains =, the xdelimname must be changed to another character like + that does not occur in nameRegex.
For example, to match Terms in Nets with consecutive 'B' characters embedded in their names, consider the following “x” command, which will match Nets named “myNetBB<digit>” and then Terms whose names end in “Erouted”:
call oad( "x", "net$=BB term$" )
However, suppose instead of “BB” in the Net names, consecutive blanks needed to be matched. These two blanks could not be distinguished from the double blank delimiting the end of the nameRegex from the following typeRegex. So a solution would be,
call oad( "o", "xdelimtype=~" )
call oad( "x", "net$= ~~term$" )
If the Net names to be matched instead had an equal sign, it would conflict with the equal sign separating the end of the typeRegex from the start of the nameRegex. So the xdelimname could be changed as follows (note that the xdelimtype could be set back to a blank, if that were no longer to be part of the nameRegex):
call oad( "o", "xdelimtype= " )
call oad( "o", "xdelimname=`" )
call oad( "x", "net$`= term$" )
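How the two delimiters carve a pathExpression into componentExpressions can be sketched in Python. The function split_path_expression is hypothetical and is not part of Si2oaD. It assumes that any run of xdelimtype characters separates components and that the first xdelimname character in a component ends its typeRegex; the tool's exact rules may differ.

```python
import re

def split_path_expression(expr, xdelimtype=" ", xdelimname="="):
    """Split a pathExpression into (typeRegex, nameRegex) pairs.

    nameRegex is None when a component has no xdelimname character.
    """
    # Assumption: any run of xdelimtype characters separates components.
    parts = re.split(re.escape(xdelimtype) + "+", expr)
    components = []
    for part in parts:
        if not part:
            continue  # leading or trailing delimiters produce empty parts
        type_regex, sep, name_regex = part.partition(xdelimname)
        components.append((type_regex, name_regex if sep else None))
    return components

# Default delimiters: blank between components, "=" before the nameRegex.
print(split_path_expression("net=[A-Z] term=.*"))
# With xdelimtype changed to "~", blanks can appear inside a nameRegex.
print(split_path_expression("net$=  ~~term$", xdelimtype="~"))
# With xdelimname changed to "`", "=" can appear inside a nameRegex.
print(split_path_expression("net$`= term$", xdelimname="`"))
```

Under these assumptions the second call keeps the two blanks as the Net nameRegex, and the third call yields "=" as the Net nameRegex.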
2.2 Owner Hierarchy
When regex matching is in effect, a call to oad(rootObject) begins matching the first componentExpression against the Objects owned by the rootObject in the call (or from the top of the RTM if oad() is invoked without a rootObject). Consequently, the start and order of componentExpressions is critical. If the first typeRegex is not aligned with the start of the ownership hierarchy, nothing will be matched. For example, the following pathExpression will match Terms (of any name), owned by Nets that are named with a single, capital letter – but only from a call starting with a Block, which is the owner of Nets:
call oad( "x", "net=[A-Z] term=.*" )
call oad(block6);
If any rootObject other than a Block is used, nothing will match. The caller must know the ownership hierarchy of the OA information model to use regex matching effectively. This expression style enables exact targeting of Objects that can appear on Object Types at different levels of the owner hierarchy. For example, IntProps in a Block with specific Types of owners can be selected as follows:
- Net owners: oad( "x", "net intprop" )
- ScalarTerm owners: oad( "x", "net scalarterm intprop" )
- all Objects directly owned by Nets: oad( "x", "net .* intprop" )
The following pathExpression must be used from a call to oad() without a root Object argument since it begins matching the first contained Lib object:
call oad( "x", "lib Design block .*" )
The above expression will match,
- only Libs (e.g., not the Session) at the top of the RTM
- only Designs in each Lib (not, for example, Cells, Views, etc.)
- only the Block in each Design (but not the Modules or Occurrence)
- every Object Type (directly) owned by the Block
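The level-by-level matching described in this section can be illustrated with a toy ownership tree in Python. Everything here is hypothetical: the node and match_owned helpers, the Object names, and the tiny tree itself are made up for the sketch, and Python's re module stands in for the POSIX regex engine.

```python
import re

def node(type_name, name=None, children=()):
    """Build one toy Object: a Type, an optional name, and owned Objects."""
    return {"type": type_name, "name": name, "children": list(children)}

def match_owned(root, components):
    """Collect owner paths selected by a list of (typeRegex, nameRegex) pairs.

    Matching starts with the Objects owned by root; each deeper level of
    ownership is checked against the next component.
    """
    found = []

    def walk(owner, level, path):
        if level == len(components):
            found.append(path)
            return
        type_regex, name_regex = components[level]
        for child in owner["children"]:
            if not re.search(type_regex, child["type"], re.IGNORECASE):
                continue
            if name_regex is not None and not re.search(name_regex, child["name"] or ""):
                continue
            walk(child, level + 1, path + [child["name"] or child["type"]])

    walk(root, 0, [])
    return found

# Hypothetical data: a Design owning a Block that owns two Nets with Terms.
block = node("Block", children=[
    node("ScalarNet", "A", [node("ScalarTerm", "t1")]),
    node("ScalarNet", "clk", [node("ScalarTerm", "t2")]),
])
design = node("Design", "Lib=Sample=schematic", [block])

path_expression = [("net", "[A-Z]"), ("term", ".*")]
print(match_owned(block, path_expression))   # [['A', 't1']]
print(match_owned(design, path_expression))  # []
```

Starting from the Design, the first component is compared with the Block, which is not a Net, so nothing is selected.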
2.3 Limitations
Since each successive level of the ownership hierarchy must be matched by each successive component in the pathExpression, there is no way to select, for example, "all IntProps, regardless of owner Object, throughout the Block's ownership hierarchy". The IntProps on all Block Nets or those on Terms can be selected but not both at the same time (since Terms and Nets have different owner Objects). Most OA Types have only a single Object Type that can be an owner; so this limitation is only relevant for Objects that can have different owner Types, such as Props, AppDefs (i.e., the instantiations of AppDefs attached to an oaObject), Constraints, Values etc.
If output=gui and display is being controlled incrementally (with maxownerlevels set to some low value), the callback code will attempt to guess where in the owner containment chain the request for more data is taking place. This is done by running regexec() on the request Object's Type, against the typeRegex in each of the componentExpressions starting from the beginning until a match is made. While in the large majority of use cases the match will reset the current containment position to the correct level, the guess could be wrong for two reasons:
- As described above for regex in general, some Objects can have different owners at different levels of the owner hierarchy.
- A sufficiently general typeRegex could end up matching in an unexpected part of the owner chain.
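The guessing step can be mimicked with a first-match search over the typeRegex list. The sketch below is a Python illustration under that reading of the text, not the actual callback code; guess_level is a hypothetical name and re.search stands in for regexec().

```python
import re

def guess_level(obj_type, type_regexes):
    """Return the index of the first typeRegex matching obj_type, or None."""
    for level, type_regex in enumerate(type_regexes):
        if re.search(type_regex, obj_type, re.IGNORECASE):
            return level
    return None

print(guess_level("IntProp", ["net", "scalarterm", "intprop"]))  # 2
print(guess_level("IntProp", ["net", ".*", "intprop"]))          # 1
```

In the second call the general ".*" component claims the IntProp one level too early, which is the second failure mode listed above.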
2.4 Special Character Semantics
2.4.1 Name Semantics
The Lib, Cell, and View names of Designs are concatenated for matching and display purposes using the xdelimname (option) character as a separator. These concatenated names can then be matched using a nameRegex as though this composite were a single name attribute of the Design. For example, if xdelimname is the '=' character, the following calls will match all the Designs in lib6 with AND in the Cell name:
call oad( "x", " Design=^AND.*=.*=.*" )
call oad(lib6)
The pathExpression, "lib Design=^[lL]ib=.*j.*=schematic$", will match only Designs having a
- Lib name of "Lib" or "lib"
- Cell name with a "j" in it
- View name "schematic"
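A small Python sketch shows how a composite Design name behaves under such a nameRegex. It assumes the Lib, Cell, and View names are joined in that order with the xdelimname character; design_name is a hypothetical helper and the Cell names are made up.

```python
import re

def design_name(lib, cell, view, xdelimname="="):
    """Join Lib, Cell, and View names into the composite name that is matched."""
    return xdelimname.join([lib, cell, view])

name_regex = "^[lL]ib=.*j.*=schematic$"

# The Cell names "adj" and "Sample" are illustrative only.
print(bool(re.search(name_regex, design_name("Lib", "adj", "schematic"))))     # True
print(bool(re.search(name_regex, design_name("Lib", "Sample", "schematic"))))  # False
print(bool(re.search(name_regex, design_name("lib", "adj", "layout"))))        # False
```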
2.4.2 Bit Semantics
Bit semantics are treated using the special characters of the current NameSpace in effect (via the namespace option).
For example, in a NameSpace where [] are the vector semantics delimiters,
call oad( "x", "^busnet$=busNet[A-Z][0-9]\\[.*:20:[13579]\\]" )
The second argument to the call above is the pathExpression (this one containing only one componentExpression). Its syntax has several special symbols that constrain the match in various ways:
- ^ means the typeRegex match must start at the initial character. Without the leading ^ the typeRegex would also match ModBusNets and OccBusNets.
- $ means the typeRegex must match to the end of string (this prevents matching BusNetDef, BusNetBit)
- = is the delimiter (defined by the xdelimname option) that separates the typeRegex on the left from the nameRegex on the right
- [A-Z][0-9] are regex ranges meaning any single character between A and Z followed by any single digit between 0 and 9 will match
- [13579] will match any odd digit
- The double backslash is required only to escape the leading vector [ delimiter character (though in the expression the trailing ] is also escaped in this way):
- The first \ preserves the second as the string argument to the call is processed by gdb
- The second \ quotes the [ character, turning it into a literal (rather than the start of a regex range expression)
This pathExpression will match BusNets that have
- a base name ending with a capital letter and one digit
- any start value
- stop value of 20
- any single-digit, odd step value.
In the si2distest/test1.cpp testcase this expression will match only busNetA2[3:20:5] when called using the block variable (in Design Lib/Sample/schematic).
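The behavior of this expression can be checked outside the debugger with a short Python sketch. Python's re module is used here in place of the POSIX engine (the two agree on this pattern), and the candidate names are the ones that appear in the verbose log for the same test case.

```python
import re

# In Python source, as in the gdb call, "\\[" yields the two characters \[
# that the regex engine reads as a literal bracket.
name_regex = "busNet[A-Z][0-9]\\[.*:20:[13579]\\]"

names = ["busNetA1", "busNetA2", "busNetA1[3:20:2]", "busNetA2[3:20:5]"]
matched = [n for n in names if re.search(name_regex, n)]
print(matched)  # ['busNetA2[3:20:5]']

# The anchored typeRegex keeps only the exact Type name (case ignored).
type_names = ["BusNet", "BusNetBit", "BusNetDef", "ModBusNet", "OccBusNet"]
exact = [t for t in type_names if re.search("^busnet$", t, re.IGNORECASE)]
print(exact)  # ['BusNet']
```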
3 Regex Options
Several option (4.1.4) settings affect regex behavior.
3.1 xdelimname
As noted in the Name Semantics example above, successful matching of LCV names requires use of the delimiter character set using the xdelimname option.
3.2 regexsticky
The regexsticky option affects the way oad() calls that do not supply a regex are handled. If set to f, those calls will not use regex matching (although the pathExpression is not deleted). If set to t and a pathExpression has been set, that expression is used in regex matching until a different one is set or regexsticky is set to f. The call below turns on "sticky":
call oad("o","regexsticky=t")
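The interaction between setting a pathExpression and the regexsticky option can be modeled as a small state machine. The Python class below is a hypothetical sketch of the behavior as described in this document, not the tool's implementation.

```python
class RegexState:
    """Toy model of when a display call uses the stored pathExpression."""

    def __init__(self, sticky=False):
        self.sticky = sticky            # models the regexsticky option
        self.path_expression = None     # kept even when matching is off
        self.armed = False              # True until the next display call

    def set_expression(self, path_expression):
        """Model oad("x", pathExpression)."""
        self.path_expression = path_expression
        self.armed = True

    def display_uses_regex(self):
        """Model a display call such as oad(rootObject)."""
        use = self.path_expression is not None and (self.sticky or self.armed)
        self.armed = False
        return use

state = RegexState(sticky=False)
state.set_expression("net$ term$")
print(state.display_uses_regex())  # True: the call right after the "x" command
print(state.display_uses_regex())  # False: not sticky, expression is kept
state.sticky = True
print(state.display_uses_regex())  # True: sticky reuses the stored expression
```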
3.3 casesensitive
Case-insensitive matching (for nameRegex only) can be enabled by setting:
(gdb) call oad( "o", "casesensitive=f" )
With the above setting, all subsequent calls to oad() that use regex matching will ignore case. Regardless of the casesensitive setting read from the .si2oadoptions file (or the default, if none), a case override flag may be prepended to a pathExpression to override that setting, using a syntax of the form,
[{+|-}{i|s}] type[=nameRegex] [type[=nameRegex] ...]
For example, the following call forces case-insensitive matching:
(gdb) call oad( "x", "+i ^lib$ design$ block net$" )
This prefix
- may be separated from the first typeRegex by 0 or more xdelimtype characters
- must start with + or - where,
- +s and -i both mean case-sensitive
- -s and +i both mean case-insensitive
Note that typeRegex matching against Object Type names is always case-insensitive.
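The override prefix rules can be expressed compactly in Python. The helper split_case_override is hypothetical. It assumes the prefix, when present, is the first two characters of the pathExpression and that any xdelimtype characters after it are simply skipped.

```python
import re

def split_case_override(path_expression, default_case_sensitive, xdelimtype=" "):
    """Return (case_sensitive, remaining_expression) for a pathExpression."""
    m = re.match(r"([+-])([is])", path_expression)
    if m is None:
        # No leading + or -: the casesensitive option setting applies.
        return default_case_sensitive, path_expression
    sign, letter = m.groups()
    # +s and -i select case-sensitive matching; -s and +i select case-insensitive.
    case_sensitive = (sign == "+") == (letter == "s")
    rest = path_expression[m.end():].lstrip(xdelimtype)
    return case_sensitive, rest

print(split_case_override("+i ^lib$ design$ block net$", True))
# (False, '^lib$ design$ block net$')
print(split_case_override("net$ term$", True))
# (True, 'net$ term$')
```

The returned flag applies to nameRegex matching only, since Type names are compared ignoring case in every configuration.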
3.4 Help Information
Setting the option verbose=2 (or higher) will produce, at the time a pathExpression is defined, details about which objects will be matched and the case-sensitivity status. For example,
(gdb) call oad( "o", "verbose=2" )
(gdb) call oad( "x", "busnet=busNet[A-Z][0-9]\\[.*:20:[13579]\\]" )
[si2oad] Processing regex
[si2oad] ...No case override since no leading + or -
[si2oad] ...Deleted all prior regex entries
[si2oad] ...busnet matches: BusNet BusNetBit BusNetDef OccBusNet OccBusNetBit OccBusNetDef ModBusNet ModBusNetBit ModBusNetDef
The same verbose level will also produce at run-time a log of which Objects in the RTM are selected and what names are matched, as shown in the following console excerpt:
[si2oad] ...MATCH automatically root Object in oad call
[si2oad] Using regex but at root Object. Skipping CheckNamePattern
[si2oad] ...SKIP TYPE=AppIntDef: NO MATCH to type-regex[0]=busnet (osd=1)
[si2oad] ...SKIP TYPE=IntProp: NO MATCH to type-regex[0]=busnet (osd=1)
...
[si2oad] Checkpointed at buf loc = 674
[si2oad] Erased output to checkpoint = 674
[si2oad] Skip NAME=busNetA1: no match to name-regex[0]=busNet[A-Z][0-9]\[.*:20:[13579]\] (osd=1)
[si2oad] ...MATCH TYPE=BusNetDef to type-regex[0]=busnet (osd=1)
[si2oad] Checkpointed at buf loc = -1
[si2oad] Erased output to checkpoint = -1
[si2oad] Skip NAME=busNetA2: no match to name-regex[0]=busNet[A-Z][0-9]\[.*:20:[13579]\] (osd=1)
[si2oad] ...SKIP TYPE=BusTermDef: NO MATCH to type-regex[0]=busnet (osd=1)
...
[si2oad] ...MATCH TYPE=BusNet to type-regex[0]=busnet (osd=1)
[si2oad] Checkpointed at buf loc = 85
[si2oad] Erased output to checkpoint = 85
[si2oad] Skip NAME=busNetA1[3:20:2]: no match to name-regex[0]=busNet[A-Z][0-9]\[.*:20:[13579]\] (osd=1)
[si2oad] ...MATCH TYPE=BusNet to type-regex[0]=busnet (osd=1)
[si2oad] Checkpointed at buf loc = -1
[si2oad] MATCH NAME=busNetA2[3:20:5]: name-regex[0]=busNet[A-Z][0-9]\[.*:20:[13579]\] (osd=1)
[si2oad] ...DISPLAY ALL remaining types in container: Past regex nEntries
[si2oad] Checkpointed at buf loc = 826
[si2oad] MATCH NAME=busTermA2[4:21:5] AUTOMATICALLY: no name-regex[1] (osd=2)
[si2oad] ...SKIP TYPE=ScalarNet: NO MATCH to type-regex[0]=busnet (osd=1)
[si2oad] ...SKIP TYPE=ScalarNet: NO MATCH to type-regex[0]=busnet (osd=1)
4 Examples
The following examples begin after running the following:
cd $Si2oadINSTALLdir/si2distest
make oad TEST=test1
b 286
c
The .si2oadoptions file settings are:
casesensitive t # Default case-sensitive setting for regex
regexsticky f # If t && regex set, it applies to all subsequent oad()
xdelimtype b # Regex delimiter char between Objects [b means blank]
xdelimname = # Regex delimiter char between name components
The verbose option is set to 2 to show extra information during processing.
(gdb) call oad("o","verbose=2")
(gdb) call oad("x"," lib design block net")
[si2oad] Processing regex
[si2oad] ...No case override since no leading + or -
[si2oad] ...Deleted all prior regex entries
[si2oad] ...lib matches: Lib AnalysisLib LibDefList LibDef LibDefListRef LibDMData
[si2oad] ...design matches: Design DesignInst
[si2oad] ...block matches: Block BlockBoundary AreaBlockage LayerBlockage
[si2oad] ...net matches: BundleNet BusNet BusNetBit ScalarNet BusNetDef NetConnectDef OccBundleNet OccBusNet OccBusNetBit OccScalarNet OccBusNetDef OccNetConnectDef ModBundleNet ModBusNet ModBusNetBit ModScalarNet ModBusNetDef ModNetConnectDef ParasiticNetwork SubNetwork SubNetworkMem
In setting the regex, note that the first component, lib, matches both Lib and AnalysisLib, while "block" matches any Type name with the word "block" in it. Prefixing a ^ and postfixing a $ to a Type name limits the match to Type names that start and end, respectively, with the indicated string. For example,
(gdb) call oad("x"," ^lib$ design$ block$ net$")
[si2oad] Processing regex
[si2oad] ...No case override since no leading + or -
[si2oad] ...Deleted all prior regex entries
[si2oad] ...^lib$ matches: Lib
[si2oad] ...design$ matches: Design
[si2oad] ...block$ matches: Block
[si2oad] ...net$ matches: BundleNet BusNet ScalarNet OccBundleNet OccBusNet OccScalarNet ModBundleNet ModBusNet ModScalarNet
The next call to oad() will show all Nets in the current RTM, along with their parent Objects up to the root Object in the call (or the top of the RTM if no root Object is used as an argument). All the attributes of the container Objects will display, but no other owned Objects besides Nets. However, all Objects continuing down the owner hierarchy from each Net owner will display. Single-line Collection indicators for all skipped Object collections will be displayed, along with the number of Objects in the Collection (though these are not expandable to show those Objects).
Since no name expressions were appended to any of the Type names, all names are matched for each of the matched Types. By contrast, only busNetA2[3:20:5] would be matched by the following regex,
(gdb) call oad("x","-s block net=busnet.*\[[0-9]*:20:[13579]]$")
...
(gdb) call oad(des)
In the call above, a root object was provided; hence the prior regex setting needed to start with the next Type of Object under that root (a Block in this case, since only Block Domain BusNets were to be matched).
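The effect of the ^ and $ anchors on the Type lists reported earlier in this section can be reproduced with a few lines of Python. The matching_types helper is hypothetical and uses Python's re module in place of the POSIX engine; the Type names are copied from the "lib matches" and "design matches" lines of the first console listing.

```python
import re

# Type names copied from the verbose output for the unanchored expression.
LIB_TYPES = ["Lib", "AnalysisLib", "LibDefList", "LibDef", "LibDefListRef", "LibDMData"]
DESIGN_TYPES = ["Design", "DesignInst"]

def matching_types(type_regex, type_names):
    """Return the Type names a typeRegex selects, ignoring case."""
    return [t for t in type_names if re.search(type_regex, t, re.IGNORECASE)]

print(matching_types("lib", LIB_TYPES))         # all six names
print(matching_types("^lib$", LIB_TYPES))       # ['Lib']
print(matching_types("design$", DESIGN_TYPES))  # ['Design']
```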
5 Per-Collection Regex and Sorting
The GUI enables setting a different regex and/or sort criteria for each Collection – regardless of
- whether a "global" typeRegex/nameRegex pathExpression has been set.
- the maxownerlevels
- how many members of the Collection have been incrementally loaded
Two different user actions on the leading + or - on the Collection line (depending upon whether the Collection is currently closed or open) will provide access to this feature: either
- CONTROL-click
- RIGHT-click [LEFT-click for left-handed settings]
A pop-up dialog solicits the typeRegex, nameRegex and/or sort to be used for this Collection. On successful parse of any regex expressions, only Objects of the matched Types and names will be displayed in the Collection as it is incrementally perused. If sort was selected, the Objects appear in the display ordered according to the algorithm of the sort DLLs selected. Any combination of typeRegex, nameRegex, and sort may be applied repeatedly to the same Collection.
The following figure illustrates the use of the pop-up box to select only those constraintDefs whose names begin with “min” and display them in alphabetical order. The pop-up box appears when the leading – character (or the + if the node is folded) on the constraintDef line is control-clicked:
After SUBMIT is clicked, the constraintDefs are selected and sorted:
5.1 Pop-Up Reuse
The name of the Collection that was CTL-clicked is inserted into the “APPLY TO” (a.k.a., “submit”) button of the pop-up to help keep track of the GUI node the pop-up is affecting, as shown in the figure below. If a bad regex is inserted, the error message resulting from the attempt to compile it will be displayed in the status bar at the bottom of the window frame. The regex can be re-edited in place and tried again. In fact, the pop-up will remain open until some other click action is made in the window, at which time the popup will be automatically closed. The popup will even remain open (for subsequent analysis) if the “RETURN TO CONSOLE” button is pressed. However, as soon as the main GUI window is closed, the pop-up will be automatically closed.

Whether the regex is good or bad, the Collection can be re-sorted, and a different regex reapplied any number of times. Clicking in the display anywhere will destroy the popup.
If a regex is applied that results in zero matches, the leading + or - will disappear from the Collection line. In this situation, if the popup has been destroyed, the only way to access it again is via the CONTROL-click technique applied to the Collection name (since the leading indicator character is gone).
5.2 Case Sensitivity
The case sensitivity ( +i -i +s -s ) override prefixes can be inserted into the beginning of the nameRegex box. Once such an override has been used, all subsequent regex matches using the same pop-up will have that same override applied. Clicking elsewhere in the main window, however, will terminate that pop-up instance and its override.
These prefixes are not supported for the typeRegex since types are always matched ignoring case.
5.3 Overhead
The sort and re-regex features require caching of a Collection's members in an Array. For large Designs this can add up to a substantial amount of extra memory allocation overhead in the application process. Only incremental Collections for which a CTL-click has been submitted will incur this overhead. Hence, if these features are not used, no memory penalty results.

