[ 
https://issues.apache.org/jira/browse/DIRSTUDIO-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441608#comment-16441608
 ] 

Emmanuel Lecharny commented on DIRSTUDIO-1174:
----------------------------------------------

Antlr is not bad. For a standard implementation, I would say that {{antlr}} is 
around 2.5 times slower than a hand-written parser.

The problem here is that we are invoking {{antlr}} from {{antlr}} to parse 
sub-elements, and it costs a hell of a lot of CPU.

That being said, my hand-written parser does not yet cover everything the 
existing parser covers (typically, it's quite strict in what it accepts, while 
the {{antlr}} parser allows less strict rules, to be able to handle schemas from 
'exotic' LDAP servers). However, I can parse pretty much all forms of 
{{AttributeType}} and {{ObjectClass}}. What is missing is the 
{{objectIdentifier}}, which is a prefix used in the {{OID}} (that is pretty 
simple to implement), and the proper handling of {{qdescr}}/{{qdstring}} (the 
quoted description/quoted string).
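
To illustrate what handling a {{qdstring}} involves, here is a minimal sketch 
based on the grammar in RFC 4512 (a single-quoted string in which {{\27}} 
stands for a quote and {{\5C}} or {{\5c}} for a backslash). The class and 
method names are hypothetical and this is not the parser discussed in this 
ticket; it also ignores anything that follows the closing quote.

```java
/** Minimal sketch of a hand-written qdstring reader (grammar from RFC 4512). */
final class QdStringSketch {
    private QdStringSketch() {
    }

    /**
     * Reads the qdstring that starts at the beginning of {@code input} and
     * returns its unescaped value. Text after the closing quote is ignored.
     * The name and signature are assumptions made for this illustration.
     */
    static String parseQdString(String input) {
        if (input.isEmpty() || input.charAt(0) != '\'') {
            throw new IllegalArgumentException("qdstring must start with a single quote");
        }
        StringBuilder value = new StringBuilder();
        int pos = 1;
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == '\'') {
                // Closing quote: the grammar requires at least one character inside.
                if (value.length() == 0) {
                    throw new IllegalArgumentException("empty qdstring");
                }
                return value.toString();
            }
            if (c == '\\') {
                // Only two escapes are legal: \27 (quote) and \5C or \5c (backslash).
                if (input.startsWith("27", pos + 1)) {
                    value.append('\'');
                } else if (input.regionMatches(true, pos + 1, "5c", 0, 2)) {
                    value.append('\\');
                } else {
                    throw new IllegalArgumentException("illegal escape at position " + pos);
                }
                pos += 3;
            } else {
                value.append(c);
                pos++;
            }
        }
        throw new IllegalArgumentException("unterminated qdstring");
    }
}
```

A strict reader like this rejects anything outside the grammar, which is 
exactly where a more lenient parser remains useful for non-conforming servers.
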
I may push a limited version of the parser tomorrow; it is up to the caller to 
use it and default to the slower version if an exception is raised (best case: 
it gets the data fine and fast; worst case: the job is done twice, but it's 
almost invisible).
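
The "fast first, slow on failure" idea could look roughly like the sketch 
below. It is only an illustration under assumptions: {{SchemaElementParser}} 
and {{FallbackParser}} are hypothetical names, not existing Apache Directory 
API classes, and the real parsers may signal failure differently.

```java
import java.text.ParseException;

/** Hypothetical parser abstraction; not an Apache Directory API type. */
interface SchemaElementParser<T> {
    T parse(String definition) throws ParseException;
}

/** Tries a strict, fast parser first and falls back to a lenient, slower one. */
final class FallbackParser<T> implements SchemaElementParser<T> {
    private final SchemaElementParser<T> fast;
    private final SchemaElementParser<T> slow;

    FallbackParser(SchemaElementParser<T> fast, SchemaElementParser<T> slow) {
        this.fast = fast;
        this.slow = slow;
    }

    @Override
    public T parse(String definition) throws ParseException {
        try {
            // Best case: the strict parser accepts the input and we are done.
            return fast.parse(definition);
        } catch (ParseException | RuntimeException e) {
            // Worst case: the input is parsed a second time by the slower parser.
            return slow.parse(definition);
        }
    }
}
```

A caller would wrap both parsers once, for example 
{{new FallbackParser<>(fastParser, antlrParser)}}, and use the result wherever 
the slower parser is used today. Errors from the slower parser still propagate 
to the caller unchanged.
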

I'll update the ticket tomorrow; at the moment I'm trying to get the existing 
tests passing with the fast parser.

No matter what, your work is certainly useful. I'm pretty sure it will be a 
valuable addition to the project, in the long run.



> Directory Studio startup very slow due to schema LDIF processing
> ----------------------------------------------------------------
>
>                 Key: DIRSTUDIO-1174
>                 URL: https://issues.apache.org/jira/browse/DIRSTUDIO-1174
>             Project: Directory Studio
>          Issue Type: Bug
>          Components: studio-connection
>    Affects Versions: 2.0.0-M13
>         Environment: openSUSE Linux (installed on my laptop)
> Sun/Oracle Java 1.8.0_111 (previously 1.7 with same issue)
> Apache Directory Studio 2.0.0 M12 and M13, plus earlier milestones too
>            Reporter: Aaron Burgemeister
>            Priority: Major
>              Labels: LDIF, schema, startup-time
>         Attachments: 20180415-no-load-schema-ldif-by-default.patch, 
> 20180416-dirstudio-1174-fix-a.patch
>
>
> For the past couple years startup of Apache Directory Studio has slowed down 
> to the point where it takes more than a minute on my not-a-slouch laptop to 
> start.  Other systems, VMs with new installs, start much faster, even on the 
> same laptop, implying something other than the base product is at fault.  As 
> a result, I had suspected maybe Directory Studio slowed down precipitously 
> due to the number of stored connections, but never confirmed the same.
> Today I connected strace to the 'java' process as it started and noticed the 
> following:
>  
> [pid 30108] *1521902717*.154740 
> open("/home/ab/.ApacheDirectoryStudio/.metadata/.plugins/org.apache.directory.studio.ldapbrowser.core/schema-ba001fb7-4b83-4dca-be44-517c14139f4b.ldif",
>  O_RDONLY) = *-1 ENOENT (No such file or directory)*
> [pid 30108] *1521902717*.154906 
> stat("/home/ab/.ApacheDirectoryStudio/.metadata/.plugins/org.apache.directory.studio.ldapbrowser.core",
>  \{st_mode=S_IFDIR|0755, st_size=5378, ...}) = 0
> [pid 30108] *1521902717*.154948 
> open("/home/ab/.ApacheDirectoryStudio/.metadata/.plugins/org.apache.directory.studio.ldapbrowser.core/schema-95e1202e-9a67-418c-afe9-b02f4e7c06df.ldif",
>  O_RDONLY) = *-1 ENOENT (No such file or directory)*
> [pid 30108] *1521902717*.155019 
> stat("/home/ab/.ApacheDirectoryStudio/.metadata/.plugins/org.apache.directory.studio.ldapbrowser.core",
>  \{st_mode=S_IFDIR|0755, st_size=5378, ...}) = 0
> [pid 30108] *1521902717*.155053 
> open("/home/ab/.ApacheDirectoryStudio/.metadata/.plugins/org.apache.directory.studio.ldapbrowser.core/schema-687f43f6-9d05-4d08-b159-35b0e76dc95a.ldif",
>  O_RDONLY) = *-1 ENOENT (No such file or directory)*
> [pid 30108] *1521902717*.155120 
> stat("/home/ab/.ApacheDirectoryStudio/.metadata/.plugins/org.apache.directory.studio.ldapbrowser.core",
>  \{st_mode=S_IFDIR|0755, st_size=5378, ...}) = 0
> [pid 30108] *1521902717*.155154 
> open("/home/ab/.ApacheDirectoryStudio/.metadata/.plugins/org.apache.directory.studio.ldapbrowser.core/schema-d62d0e10-c81e-4477-81a2-ac2c9e5c7169.ldif",
>  O_RDONLY) = *121*
> [pid 30108] *1521902718*.698702 
> stat("/home/ab/.ApacheDirectoryStudio/.metadata/.plugins/org.apache.directory.studio.ldapbrowser.core",
>  \{st_mode=S_IFDIR|0755, st_size=5378, ...}) = 0
> [pid 30108] *1521902718*.698800 
> open("/home/ab/.ApacheDirectoryStudio/.metadata/.plugins/org.apache.directory.studio.ldapbrowser.core/schema-7b6a9a7c-2192-4b24-8874-1378e5b1b30c.ldif",
>  O_RDONLY) = *126*
> [pid 30108] *1521902719*.770570 
> stat("/home/ab/.ApacheDirectoryStudio/.metadata/.plugins/org.apache.directory.studio.ldapbrowser.core",
>  \{st_mode=S_IFDIR|0755, st_size=5378, ...}) = 0
> [pid 30108] *1521902719*.770660 
> open("/home/ab/.ApacheDirectoryStudio/.metadata/.plugins/org.apache.directory.studio.ldapbrowser.core/schema-b3b02838-067f-4f24-bf92-6bf3fccdbc52.ldif",
>  O_RDONLY) = *127*
> [pid 30108] *1521902721*.198417 
> stat("/home/ab/.ApacheDirectoryStudio/.metadata/.plugins/org.apache.directory.studio.ldapbrowser.core",
>  \{st_mode=S_IFDIR|0755, st_size=5378, ...}) = 0
>  
> Notice the timestamps (bolded near beginning of line) and how they change 
> based on whether or not a schema LDIF file was found (bolded near end of 
> line) and, presumably, processed.  When a file is not found, subsequent files 
> are sought immediately without significantly delaying startup.
> These schema files are all under 1 MiB in size, but most of them are several 
> hundred KiBs, approaching the 1 MiB size, so depending on what Directory 
> Studio is doing as it reads and processes these files, it would seem that 
> this introduces the slowness when a file is found.
> Looking for an existing issue I found DIRSTUDIO-1027 which may be related.  
> During startup of Directory Studio one of my laptop's eight cores is fully 
> utilized, which makes me think this may be more about processing the LDIF 
> than just swapping memory due to inefficient data structures, but I am not a 
> memory management expert, so I only mention the possibility here in case it 
> helps find the root cause quickly.
> My Directory Studio's total startup time: sixty-one (61) seconds.
> Time spent (per strace) reading schema files: fifty-five (55) seconds.
> Estimated non-schema startup time: six (6) seconds.
>  
> Steps to duplicate:
> Have a lot, e.g. 100, of stored schema LDIF files from previous connections.
> Start up Apache Directory Studio.
> Expected results: Startup quickly.  Processing old schema LDIFs, when most of 
> them will not be used at any given time, seems like a waste of time in 
> general.  Perhaps this can be done only when a connection is accessed in some 
> way rather than at startup.
> Actual results: Slow startup.
> Reproducible: I think so, but am not sure why my system has these schema 
> LDIFs when others may not.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)