Athosvk added a comment.

The change to USR seems like quite an improvement already! That being said, I 
think it might be preferable to avoid using strings for linking things 
together. In our clang-doc, we used pointers directly to refer to other types. 
So for example, our class for storing Record/CXX-related information has 
something like:

  std::vector<Function*> mMethods;
  std::vector<Variable*> mVariables;
  std::vector<Enum*>     mEnums;
  std::vector<Typedef*>  mTypedefs;

Only upon serialization do we fetch some kind of USR that uniquely identifies 
the type. This is especially useful to us for the conversion to HTML, and I 
think the same would go for this backend: with the current approach you'll have 
to do string lookups to get to the actual types, which would be inefficient in 
several respects. Using pointers can make the backend a little more of a 
one-to-one conversion, e.g. with one of our HTML template definitions (note: 
this is a Jinja2 template in Python):

  {%- for enum in inEntry.GetMemberEnums() -%}
        <tr class="separator">
                <td class="memSeparator" colspan="3"></td>
        </tr>
        <tr class="memitem:EAllocatorStrategy">
                <td class="memItemLeft" align="right">{{- Modifiers.RenderAccessModifier(enum.GetAccessModifier()) -}}</td>
                <td class="memItemMiddle" align="left">enum <a href="{{ enum.GetID() }}.html">{{- enum.GetName().GetName()|e -}}</a></td>
                <td class="memItemRight" valign="bottom">{{- Descriptions.RenderDescription(enum.GetBriefDescription()) -}}</td>
        </tr>
  {%- endfor -%}
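
To make the pointer-based linking concrete, here is a minimal C++ sketch. The 
Enum and Record structs and both functions are hypothetical stand-ins, not our 
actual classes and not clang-doc's; the point is only that rendering follows 
pointers directly, while the USR is consulted solely when writing links out.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for documentation entries (illustration only).
struct Enum {
  std::string Name;
  std::string USR;  // unique identifier, only needed when serializing
};

struct Record {
  std::string Name;
  std::vector<const Enum*> mEnums;  // direct links, no string lookup required
};

// Backend-style traversal: follows the pointers and never touches the USR.
std::string RenderEnumList(const Record& record) {
  std::ostringstream out;
  for (const Enum* e : record.mEnums) {
    assert(e != nullptr);
    out << "enum " << e->Name << "\n";
  }
  return out.str();
}

// Serialization is the only place where links are turned into USR strings.
std::string SerializeEnumRefs(const Record& record) {
  std::ostringstream out;
  for (const Enum* e : record.mEnums) {
    assert(e != nullptr);
    out << e->USR << "\n";
  }
  return out.str();
}
```

With this layout the HTML generation code can walk record.mEnums much like the 
template above does, without a map from USR to entry.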

The disadvantage is, of course, that you add complexity to certain parts of the 
deserialization (and serialization) for nested types and inheritance: you 
either have to process entries in the correct order or defer initializing these 
pointers. But see this as just some thought sharing. I do think this would 
improve the interaction in the backend (assuming you use the same 
representation as currently in the frontend). Also, we didn't apply this to our 
Type representation (which we use to store the type of a member, parameter, 
etc.). It stores the name of the type rather than a pointer to it (since it can 
also be a built-in), though it embeds pretty much every possible modifier on 
said type, like this:

  EntryName                               mName;
  bool                                    mIsConst = false;
  EReferenceType                          mReferenceType = EReferenceType::None;
  std::vector<bool>                       mPointerConstnessMask;
  std::vector<std::string>                mArraySizes;
  bool                                    mIsAtomic = false;
  std::vector<Attribute>                  mAttributes;
  bool                                    mIsExpansion = false;
  std::vector<TemplateArgument>           mTemplateArguments;
  std::unique_ptr<FunctionTypeProperties> mFunctionTypeProperties = nullptr;
  EntryName                               mParentCXXEntry;

The last member covers the case where the type is a pointer to member, though 
some of the other fields may need explaining too. Anyway, this is just to give 
some insight into how we structured our representation, where we largely 
avoided string representations wherever possible.
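
The "defer initializing the pointers" option mentioned earlier could look 
roughly like the two-pass sketch below. All names here (Record, 
SerializedRecord, Database, Deserialize) are made up for illustration and are 
not taken from our code or from clang-doc. It assumes the serialized form 
stores parent links as USR strings.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical in-memory record: base classes are linked by pointer.
struct Record {
  std::string USR;
  std::string Name;
  std::vector<const Record*> Parents;
};

// Hypothetical serialized form: links are stored as USR strings.
struct SerializedRecord {
  std::string USR;
  std::string Name;
  std::vector<std::string> ParentUSRs;
};

struct Database {
  // Heap-allocated so addresses stay stable while the vector grows.
  std::vector<std::unique_ptr<Record>> Records;
  std::unordered_map<std::string, Record*> ByUSR;
};

Database Deserialize(const std::vector<SerializedRecord>& input) {
  Database db;
  // Pass 1: create every record first, so every link target exists.
  for (const SerializedRecord& s : input) {
    auto record = std::make_unique<Record>();
    record->USR = s.USR;
    record->Name = s.Name;
    db.ByUSR[s.USR] = record.get();
    db.Records.push_back(std::move(record));
  }
  // Pass 2: resolve USR strings to pointers; input order no longer matters.
  for (std::size_t i = 0; i < input.size(); ++i) {
    for (const std::string& parentUSR : input[i].ParentUSRs) {
      auto it = db.ByUSR.find(parentUSR);
      if (it != db.ByUSR.end())  // unknown parents are skipped in this sketch
        db.Records[i]->Parents.push_back(it->second);
    }
  }
  return db;
}
```

The string lookups still happen, but only once at load time; afterwards the 
backend works purely with pointers.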

Have you actually started work on a backend already? Developing the backend and 
frontend in tandem can provide some additional insights as to how things should 
be structured, especially representation-wise!



================
Comment at: clang-doc/Representation.h:113
+  TagTypeKind TagType;
+  llvm::SmallVector<std::unique_ptr<MemberTypeInfo>, 4> Members;
+  llvm::SmallVector<std::string, 4> ParentUSRs;
----------------
How come these are actually unique_ptrs? They can be stored directly in the 
vector, right? (Same for CommentInfo children, FunctionInfo params, etc.)
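
As a rough illustration of what I mean, here is a sketch with the members 
stored by value. The structs are simplified, hypothetical versions of the ones 
in the patch, and std::vector stands in for llvm::SmallVector to keep the 
example self-contained.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified, hypothetical version of a member description.
struct MemberTypeInfo {
  std::string TypeName;
  std::string Name;
};

struct RecordInfo {
  // Elements live directly in the container: no per-element heap allocation
  // and no extra pointer indirection when iterating.
  std::vector<MemberTypeInfo> Members;
};

// Builds a record describing a struct with two int members.
RecordInfo MakePoint() {
  RecordInfo info;
  info.Members.push_back({"int", "x"});
  info.Members.push_back({"int", "y"});
  assert(info.Members.size() == 2);
  return info;
}
```

This only works as long as the elements are not polymorphic and nothing holds 
on to their addresses across reallocation, so there may be a reason for the 
indirection that I'm missing.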


https://reviews.llvm.org/D41102



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
