On Wed, Jun 16, 2010 at 7:45 PM, Joseph Boyle <[email protected]> wrote:

> We could have an abstract class with more than one concrete implementation
> subclass - unparsed string, node tree possibly including unparsed strings
> for subexpressions, fully parsed but XRIs not verified, fully parsed and
> verified.
>
> Can we depend on XDI messages being in a canonical format? If equivalent
> messages with different text are possible, then string equality won't be
> enough.

No, there is no canonical format. The same XDI graph can be serialized in different forms (because just like in RDF there is no built-in order of subjects, predicates and objects).
But I'm not talking about the XDI graph itself, I'm talking about the XRIs inside the graph. Parsing =markus is simple, but parsing e.g. (/+!15$v!3) is less simple, and Sergey suggested this may introduce a performance bottleneck.

> How can we collect data on how long parsing (and other operations) are
> taking?

Sergey has done research on this and posted some detailed data earlier in this thread.

> On Jun 16, 2010, at 10:38 AM, Markus Sabadello wrote:
>
> I may have an idea for addressing the other concern as well (the time it
> takes to parse XDI data). Not sure if this is a priority, but anyway here's
> the idea:
>
> Right now, the XDI server parses every single XRI in an XDI message. While
> it sometimes IS important for both XDI servers and clients to fully
> "understand" the XRIs (after all that's one of the advantages of XDI), in
> many other situations it is enough to internally treat them like strings.
> For example, if the XDI server encounters a $get XRI, it only needs to know
> that it is $get, but it doesn't need to understand the details that the XRI
> consists of a single subsegment with a $ global context symbol.
>
> So what I am saying is that we could modify the XRI3, XRI3Segment and other
> classes to just act as wrappers for java.lang.String, and only actually
> invoke the ABNF parser when that is necessary.
> For common methods such as equals(), hashCode() and toString(), parsing is
> not necessary, and it should therefore be possible to save time.
>
> The one behavioral change from an outside perspective would be that "new
> XRI3Segment()" would no longer immediately throw an exception for invalid
> XRIs. Not sure if this is a problem. The XDI server may end up accepting
> messages that actually contain invalid XRIs.
>
> Hmm on second thought, this may introduce some risks and problems. Maybe
> not such a good idea after all :( But anyway, I wanted to quickly post it..
>
> Markus
>
> On Tue, Jun 8, 2010 at 7:00 PM, Mike McIntosh <[email protected]> wrote:
>
>> Thanks Markus,
>>
>> Sergey and Valery, can you please update and build and check the status of
>> the bottleneck problem?
>>
>> Regards,
>>
>> Michael McIntosh
>> VP Development
>> Azigo
>>
>> *From:* [email protected] [mailto:[email protected]] *On Behalf Of *Markus Sabadello
>> *Sent:* Tuesday, June 08, 2010 12:59 PM
>> *To:* [email protected]
>> *Cc:* [email protected]
>> *Subject:* [higgins-dev] Re: Bottleneck points in PDS service
>>
>> Hello,
>>
>> I have checked in changes to return fresh XDIReader and XDIWriter
>> instances instead of singletons.
>>
>> Markus
>>
>> On Fri, May 21, 2010 at 7:44 PM, Markus Sabadello <[email protected]>
>> wrote:
>>
>> Hello Sergey,
>>
>> On Fri, May 21, 2010 at 4:13 PM, Sergey Lyakhov <[email protected]>
>> wrote:
>>
>> Markus,
>>
>> I've profiled/debugged PDS service and found two bottlenecks in XDI4J:
>>
>> 1. Most of the processing time is taken by xdi4j.xri3.impl.parser.Parser,
>> see attached 1_thread.html.
>> However this is a class generated from ABNF, and I am not sure there is a
>> way to significantly increase its performance.
>>
>> Yes I agree this is probably not possible..
>> The only option here would be to use String instead of XRI3Segment, but
>> this would have big implications on the entire library, because in some
>> places the functionality of XRI3Segment is really needed.
>>
>> 2. XDI has multithreading problems. The processing time for parallel
>> threads increases linearly with the number of threads (see attached
>> 5_threads.html and 10_threads.html).
>>
>> This occurs because XDIReaderRegistry contains singleton readers, which,
>> in turn, are not thread-safe and contain a synchronized read() method.
>> So, all threads in EndpointServlet.readFromBody() get the same singleton
>> instance of XDIReader from XDIReaderRegistry and wait on its read()
>> method. We can try to fix that by changing XDIReaderRegistry to return a
>> new instance of the reader instead of a singleton.
>>
>> The reason why I used singletons was that I thought it's better to re-use
>> the reader objects instead of creating/destroying them all the time. But
>> yes, maybe I was wrong and it's better to return new instances for every
>> read operation.
>>
>> Another thing to make sure is that the client sets the header
>> Content-Type: text/xdi+x3, so that the XDI server uses the X3StandardReader
>> instead of the AutoReader, but I think this is happening already.
>>
>> Out of curiosity, what software did you use for profiling?
>>
>> Markus
>>
>> Thanks,
>> Sergey Lyakhov
>>
>> _______________________________________________
>> higgins-dev mailing list
>> [email protected]
>> https://dev.eclipse.org/mailman/listinfo/higgins-dev
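
For reference, the singleton contention Sergey describes above, and the "return a new instance" fix, can be sketched roughly as follows. This is only an illustration and not the actual XDI4J code: SketchReader and SketchReaderRegistry are made-up names, and the factory-per-MIME-type lookup is my assumption about how such a registry could be arranged. The point is just that once every caller gets its own reader, read() no longer needs to be synchronized and threads stop queueing on one shared object.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-in for a reader that keeps per-parse state and is
// therefore not safe to share between threads.
class SketchReader {
    private final StringBuilder buffer = new StringBuilder(); // per-instance parse state

    // No "synchronized" here: each caller is expected to own its own instance.
    String read(String body) {
        buffer.setLength(0);
        buffer.append(body.trim());
        return buffer.toString();
    }
}

// Hypothetical registry that maps a MIME type to a factory instead of to a
// shared singleton, so every lookup yields a fresh reader.
class SketchReaderRegistry {
    private final Map<String, Supplier<SketchReader>> factories = new ConcurrentHashMap<>();

    void register(String mimeType, Supplier<SketchReader> factory) {
        factories.put(mimeType, factory);
    }

    SketchReader forMimeType(String mimeType) {
        Supplier<SketchReader> factory = factories.get(mimeType);
        if (factory == null) {
            throw new IllegalArgumentException("No reader registered for " + mimeType);
        }
        return factory.get(); // new instance per call, nothing shared across threads
    }
}

class RegistryDemo {
    public static void main(String[] args) {
        SketchReaderRegistry registry = new SketchReaderRegistry();
        registry.register("text/xdi+x3", SketchReader::new);

        SketchReader first = registry.forMimeType("text/xdi+x3");
        SketchReader second = registry.forMimeType("text/xdi+x3");
        System.out.println("distinct instances: " + (first != second));
    }
}
```

The cost is one short-lived object per request, which is usually cheap next to serializing every request behind a single lock, but that would need to be confirmed by re-running the profiling.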
_______________________________________________
higgins-dev mailing list
[email protected]
https://dev.eclipse.org/mailman/listinfo/higgins-dev
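
To make the lazy-parsing idea from this thread a bit more concrete, here is a minimal sketch of a string wrapper that only invokes a parser on demand. Everything in it is hypothetical: LazySegment is not an XDI4J class, and the parser is passed in as a plain function standing in for the ABNF-generated one. It also shows the behavioral change mentioned in the thread, namely that invalid input would fail on the first parsed() call and not in the constructor.

```java
import java.util.function.Function;

// Hypothetical lazy wrapper: stores the raw string and defers the expensive
// parse until parsed() is first called. Assumes the parser never returns null.
final class LazySegment<T> {
    private final String text;
    private final Function<String, T> parser; // stand-in for the ABNF-generated parser
    private T parsed;                         // cached parse result, null until first use

    LazySegment(String text, Function<String, T> parser) {
        if (text == null || parser == null) {
            throw new NullPointerException("text and parser must not be null");
        }
        this.text = text;
        this.parser = parser;
    }

    // Parses on first use and caches the result. Any exception for invalid
    // input is thrown here, not in the constructor.
    synchronized T parsed() {
        if (parsed == null) {
            parsed = parser.apply(text);
        }
        return parsed;
    }

    // equals(), hashCode() and toString() work on the raw string only.
    @Override
    public boolean equals(Object other) {
        return other instanceof LazySegment && text.equals(((LazySegment<?>) other).text);
    }

    @Override
    public int hashCode() {
        return text.hashCode();
    }

    @Override
    public String toString() {
        return text;
    }
}

class LazySegmentDemo {
    public static void main(String[] args) {
        int[] calls = {0}; // counts how often the stand-in parser runs
        Function<String, Integer> countingParser = s -> { calls[0]++; return s.length(); };

        LazySegment<Integer> a = new LazySegment<>("$get", countingParser);
        LazySegment<Integer> b = new LazySegment<>("$get", countingParser);

        System.out.println("equal by text: " + a.equals(b));
        System.out.println("parser calls before parsed(): " + calls[0]);
        a.parsed();
        a.parsed();
        System.out.println("parser calls after two parsed() calls: " + calls[0]);
    }
}
```

Note that comparing by raw text is only correct if two equivalent XRIs always have identical text, which is exactly the canonical-format question raised at the top of this message, so this sketch does not settle that concern.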
