[jira] Commented: (SOLR-278) LukeRequest/Response for handling show=schema
[ https://issues.apache.org/jira/browse/SOLR-278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509156 ]

Ryan McKinley commented on SOLR-278:

looks good. Do you have suggestions on how to modify SOLR-266? The schema info is different enough (fields etc.) that nothing popped out at me...

LukeRequest/Response for handling show=schema

Key: SOLR-278
URL: https://issues.apache.org/jira/browse/SOLR-278
Project: Solr
Issue Type: Improvement
Components: clients - java
Affects Versions: 1.3
Reporter: Will Johnson
Priority: Minor
Fix For: 1.3
Attachments: LukeSchemaHandling.patch

the soon to be attached patch adds a method to LukeRequest to set the option for showing schema from SOLR-266. the patch also modifies LukeResponse to handle the schema info in the same manner as the fields from the 'normal' luke response. i think it's worth talking about unifying the response format so that they aren't different, but that's a larger discussion.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-278) LukeRequest/Response for handling show=schema
[ https://issues.apache.org/jira/browse/SOLR-278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Will Johnson updated SOLR-278:

Attachment: LukeSchemaHandling.patch
[jira] Created: (SOLR-278) LukeRequest/Response for handling show=schema
LukeRequest/Response for handling show=schema

Key: SOLR-278
URL: https://issues.apache.org/jira/browse/SOLR-278
Project: Solr
Issue Type: Improvement
Components: clients - java
Affects Versions: 1.3
Reporter: Will Johnson
Priority: Minor
Fix For: 1.3

the soon to be attached patch adds a method to LukeRequest to set the option for showing schema from SOLR-266. the patch also modifies LukeResponse to handle the schema info in the same manner as the fields from the 'normal' luke response. i think it's worth talking about unifying the response format so that they aren't different, but that's a larger discussion.
[jira] Commented: (SOLR-278) LukeRequest/Response for handling show=schema
[ https://issues.apache.org/jira/browse/SOLR-278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509160 ]

Will Johnson commented on SOLR-278:

I guess I was hoping for a superset of features in LukeResponse.FieldInfo, which would be partially set by the schema and partially set by the luke-ish info. We could even merge the two if it made sense. In the end I need to get a list of fields that solr currently knows about, which seems to be a grouping of both the schema and the index via dynamic fields. The current patch does this but I think there is a better approach somewhere out there.

- will
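To make the "superset" idea in this comment concrete, here is a minimal sketch of one record per field name, filled partly from the schema and partly from the index. The class names, the two attributes, and the map-shaped inputs are all hypothetical, chosen only for illustration; this is not the actual LukeResponse.FieldInfo API or the attached patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical merged view of one field; NOT the real LukeResponse.FieldInfo. */
final class MergedFieldInfo {
    final String name;
    String schemaType; // from the schema; null if the field only appears in the index
    Integer docCount;  // from the index; null if no indexed document uses the field

    MergedFieldInfo(String name) {
        this.name = name;
    }
}

final class FieldInfoMerger {
    /**
     * Builds one entry per field name known to either source.
     * schemaTypes: field name -> type name (assumed shape of the schema section).
     * indexDocs:   field name -> document count (assumed shape of the index section).
     */
    static Map<String, MergedFieldInfo> merge(Map<String, String> schemaTypes,
                                              Map<String, Integer> indexDocs) {
        Map<String, MergedFieldInfo> merged = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : schemaTypes.entrySet()) {
            merged.computeIfAbsent(e.getKey(), MergedFieldInfo::new).schemaType = e.getValue();
        }
        for (Map.Entry<String, Integer> e : indexDocs.entrySet()) {
            merged.computeIfAbsent(e.getKey(), MergedFieldInfo::new).docCount = e.getValue();
        }
        return merged;
    }
}
```

A field created through a dynamic field rule would show up here with a docCount but no schemaType, which is the "schema plus index" grouping described above.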
[jira] Commented: (SOLR-278) LukeRequest/Response for handling show=schema
[ https://issues.apache.org/jira/browse/SOLR-278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509164 ]

Ryan McKinley commented on SOLR-278:

yes, there must be a better solution to merge schema vs index field info. I'm open to any suggestions.

I added your changes in rev 551971
[jira] Created: (SOLR-279) System Properties for Testing are now in Java code AND Ant build.xml
System Properties for Testing are now in Java code AND Ant build.xml

Key: SOLR-279
URL: https://issues.apache.org/jira/browse/SOLR-279
Project: Solr
Issue Type: Bug
Affects Versions: 1.3
Reporter: Eric Pugh
Priority: Minor
Fix For: 1.3

The system properties can now be pulled out of build.xml due to commit revision 551701
[jira] Updated: (SOLR-279) System Properties for Testing are now in Java code AND Ant build.xml
[ https://issues.apache.org/jira/browse/SOLR-279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Pugh updated SOLR-279:

Attachment: syspropties.patch

Patch file for build.xml for removing system properties
Thanks for commit 551701!
Yonik,

Thanks for commit 551701. I have created bug https://issues.apache.org/jira/browse/SOLR-279 for removing the properties from build.xml as well.

Cheers,
Eric

---
Principal
OpenSource Connections
Site: http://www.opensourceconnections.com
Blog: http://blog.opensourceconnections.com
Cell: 1-434-466-1467
[jira] Resolved: (SOLR-279) System Properties for Testing are now in Java code AND Ant build.xml
[ https://issues.apache.org/jira/browse/SOLR-279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-279.

Resolution: Fixed

committed.
stax vs xpp XmlUpdateHandler
I just did some performance testing to compare the stax vs xpp implementation. As far as I can tell there is no real difference between them.

Using solrj, this adds 1 documents for each handler - running each as an independent call.

STAX: 8631 8221 8525 8383 8487 = 42247
XPP:  8309 8438 8261 8794 8237 = 42039

How do you all feel about moving:
XmlUpdateRequestHandler -> XppUpdateRequestHandler
StaxUpdateRequestHandler -> XmlUpdateRequestHandler

then deprecating XppUpdateRequestHandler? This will urge people to use the Stax implementation sooner rather than later and should help iron out any problems early.

thoughts?

Here is the actual test code:

  public long makeRequests( String path, int cnt ) throws Exception
  {
    // start from an empty, optimized index
    server.deleteByQuery( "*:*" );
    server.optimize();

    long now = System.currentTimeMillis();
    UpdateRequest req = new UpdateRequest();
    req.setPath( path );
    // send each document as its own request
    for( int i=0; i<cnt; i++ ) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField( "id", i+"" );
      doc.addField( "name", "hello" );
      for( int x=5; x<5; x++ ) {
        doc.addField( "feature", "feature:"+x );
      }
      req.add( doc );
      server.request( req );
      req.clear();
    }
    server.commit();
    long elapsed = System.currentTimeMillis() - now;

    // make sure every document made it into the index
    QueryResponse response = server.query( new SolrQuery( "*:*" ) );
    if( cnt != response.getResults().getNumFound() ) {
      throw new Exception( "did not add everything!" );
    }
    return elapsed;
  }
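For reference, the totals quoted in this message can be re-derived with a few lines of Java. This is only a sketch for checking the arithmetic: the class and method names are made up here, and the numbers are presumably milliseconds, since the test code measures with System.currentTimeMillis().

```java
import java.util.Arrays;

final class TimingSummary {
    /** Sum of the individual run times. */
    static long total(long[] runs) {
        return Arrays.stream(runs).sum();
    }

    /** Gap between two totals, as a fraction of the smaller one. */
    static double relativeGap(long a, long b) {
        return Math.abs(a - b) / (double) Math.min(a, b);
    }

    public static void main(String[] args) {
        // run times copied from the message above
        long[] stax = {8631, 8221, 8525, 8383, 8487};
        long[] xpp = {8309, 8438, 8261, 8794, 8237};
        long staxTotal = total(stax);
        long xppTotal = total(xpp);
        System.out.printf("STAX total=%d XPP total=%d gap=%.2f%%%n",
                staxTotal, xppTotal, 100.0 * relativeGap(staxTotal, xppTotal));
    }
}
```

The two totals differ by well under one percent, which is smaller than the spread between individual runs of the same handler, so "no real difference" is a fair reading of these five runs.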
Re: stax vs xpp XmlUpdateHandler
: How do you all feel about moving:
: XmlUpdateRequestHandler -> XppUpdateRequestHandler
: StaxUpdateRequestHandler -> XmlUpdateRequestHandler
:
: then deprecating XppUpdateRequestHandler? This will urge people to use
: the Stax implementation sooner than later and should help iron out any
: problems sooner than later.

I'm kinda out of the loop on the whole Stax/Xpp/Xml update parsing stuff ... am i remembering correctly that the end game goal is to reduce/eliminate dependencies on XPP? (because ? stax is Java standard included out-of-the-box with java6? (i'm guessing))

: Here is the actual test code:

those are some fairly small documents ... we should probably test out some bigger inputs. A lot of people seem to be sending multiple documents at a time as well, so we should test that use case (ie: add containing 1 small documents; add containing 100 medium documents; add containing 1 big document)

for the purpose of perf testing the XML parsing it might also make sense to use a schema where every field is ignored (ie: no analysis, no stored values) to help isolate the parsing costs.

-Hoss
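The different input shapes suggested above (many small documents, some medium ones, one big one) could be generated with a small helper along these lines. This is a sketch under assumptions: the class name, the field names, and the filler text are invented for illustration, and the document and field counts are left as parameters rather than taken from the thread.

```java
final class AddPayloads {
    /**
     * Builds an <add> message with docCount documents, each carrying an id
     * field plus fieldsPerDoc text fields. The generated values contain no
     * characters that need XML escaping.
     */
    static String build(int docCount, int fieldsPerDoc) {
        StringBuilder sb = new StringBuilder("<add>");
        for (int d = 0; d < docCount; d++) {
            sb.append("<doc><field name=\"id\">").append(d).append("</field>");
            for (int f = 0; f < fieldsPerDoc; f++) {
                sb.append("<field name=\"text\">some text ").append(f).append("</field>");
            }
            sb.append("</doc>");
        }
        return sb.append("</add>").toString();
    }

    public static void main(String[] args) {
        // one tiny payload, printed so the shape is visible
        System.out.println(build(2, 1));
    }
}
```

Posting the same generated string to each handler keeps the input identical across runs, so any timing difference comes from the handlers rather than from the test data.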
Re: stax vs xpp XmlUpdateHandler
> I'm kinda out of the loop on the whole Stax/Xpp/Xml update parsing stuff
> ... am i remembering correctly that the end game goal is to reduce/eliminate
> dependencies on XPP? (because ? stax is Java standard included
> out-of-the-box with java6? (i'm guessing))

For me the biggest reason is to de-couple the parsing from the actual update processing. I need to do custom processing in between (SOLR-269).

Stax is a growing standard, so it seems like the right choice if we are reworking document parsing. (depending on your preference) It is a bit easier to work with and more readable.

With the parsing separated from indexing, it would be straightforward to have a single UpdateRequestHandler that could read the content type and pick how to parse the documents - using the same indexing strategies/format/processor etc.

> A lot of people seem to be sending multiple documents at a time as well,
> so we should test that use case (ie: add containing 1 small documents;
> add containing 100 medium documents; add containing 1 big document)

that makes sense. I don't claim the tests I ran are representative - i just wanted to make sure the overall speeds are within the same ballpark.

this one sends 1 docs together (with 10 text fields), then 1 docs individually each with 100 text fields. Still not the most scientific, but here it is:

STAX: 57642
XPP:  58012

  @Override
  public void setUp() throws Exception
  {
    super.setUp();

    // setup the server...
    server = new EmbeddedSolrServer( SolrCore.getSolrCore() );
  }

  public SolrInputDocument createDocument( int id, int fcnt )
  {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField( "id", id+"" );
    doc.addField( "name", "hello" );
    for( int x=5; x<fcnt; x++ ) {
      doc.addField( "text", "this is just some text with asgasdg; "+x );
    }
    return doc;
  }

  public long makeRequests( String path, int cnt ) throws Exception
  {
    server.deleteByQuery( "*:*" ); // delete everything!
    server.optimize();

    long now = System.currentTimeMillis();
    UpdateRequest req = new UpdateRequest();
    req.setPath( path );

    // Send all the docs together
    for( int i=0; i<cnt; i++ ) {
      req.add( createDocument( i, 10 ) );
    }
    server.request( req );
    req.clear();

    // Send them one at a time
    for( int i=0; i<cnt; i++ ) {
      req.add( createDocument( i+cnt, 100 ) );
      server.request( req );
      req.clear();
    }
    server.commit();
    long elapsed = System.currentTimeMillis() - now;

    // both batches must be in the index
    QueryResponse response = server.query( new SolrQuery( "*:*" ) );
    if( (cnt*2) != response.getResults().getNumFound() ) {
      throw new Exception( "did not add everything!" );
    }
    return elapsed;
  }

  /**
   * query the example
   */
  public void testExampleConfig() throws Exception
  {
    // Empty the database...
    long time = makeRequests( "/update", 1 );
    System.out.println( "time: " + time );
  }
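The point about de-coupling parsing from update processing can be sketched with the StAX API from the JDK (javax.xml.stream). In this minimal example the parser only turns each <doc> into a map of field values and hands it to a callback; what the callback does (custom processing, indexing) is outside the parser. The class name and the map-based document shape are assumptions for illustration, not the StaxUpdateRequestHandler code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class StaxAddParser {
    /**
     * Parses an <add> message and passes each <doc> to the processor as
     * field name -> list of values. Parsing errors are rethrown unchecked.
     */
    static void parse(String xml, Consumer<Map<String, List<String>>> processor) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            Map<String, List<String>> doc = null;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String tag = reader.getLocalName();
                    if ("doc".equals(tag)) {
                        doc = new LinkedHashMap<>();
                    } else if ("field".equals(tag) && doc != null) {
                        String name = reader.getAttributeValue(null, "name");
                        String value = reader.getElementText(); // reads up to </field>
                        doc.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "doc".equals(reader.getLocalName()) && doc != null) {
                    processor.accept(doc); // hand off; the parser knows nothing about indexing
                    doc = null;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("could not parse add message", e);
        }
    }
}
```

Because the processor is just a callback, a handler that picks a parser based on content type could feed the same callback from a different parser without touching the indexing side.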
Re: stax vs xpp XmlUpdateHandler
On 6/29/07, Ryan McKinley [EMAIL PROTECTED] wrote:
> How do you all feel about moving:
> XmlUpdateRequestHandler -> XppUpdateRequestHandler
> StaxUpdateRequestHandler -> XmlUpdateRequestHandler
>
> then deprecating XppUpdateRequestHandler?

+1

I think we could remove the XppUpdateRequestHandler relatively quickly to get rid of the XPP dependency. It's more of an implementation detail and shouldn't be visible to most users.

-Yonik
Re: stax vs xpp XmlUpdateHandler
> so we should test that use case (ie: add containing 1 small documents;

For processing a single request with 1 documents, the existing XPP update handler is faster than the new StaxUpdateHandler.

XPP:  6888 6714
STAX: 8665 8313

I looked into it, and the difference seems to be entirely in the logging strategy.

the XPP handler prints out a single line for all 10K docs:

INFO: added id={0,1,2,3,4,5,6,7,8,9,10,11,12,13,1...

the STAX one sends each document to the processor that logs the add individually:

INFO: added id={0} in 28ms [29]
INFO: added id={1} in 3ms [33]
INFO: added id={2} in 1ms [35]
...

If I remove logging, the same test runs in:

STAX: 6783 6834

essentially equivalent to the XPP version

ryan
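The two logging strategies described above differ only in how many log records one request produces. A minimal sketch of the difference follows; the class and method names are hypothetical, and the messages copy only the "added id={...}" part of the log lines quoted in this message, leaving out the timing suffix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AddLogMessages {
    /** One message for the whole request, in the style of the XPP handler's output. */
    static String batched(List<String> ids) {
        return "added id={" + String.join(",", ids) + "}";
    }

    /** One message per document, in the style of the STAX handler's output. */
    static List<String> perDocument(List<String> ids) {
        List<String> lines = new ArrayList<>();
        for (String id : ids) {
            lines.add("added id={" + id + "}");
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("0", "1", "2");
        System.out.println(batched(ids));
        perDocument(ids).forEach(System.out::println);
    }
}
```

With N documents in a request the per-document form produces N log records instead of one, which matches the explanation given here for the gap between the two handlers.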
Re: stax vs xpp XmlUpdateHandler
On 6/29/07, Ryan McKinley [EMAIL PROTECTED] wrote:
> If I remove logging, the same test runs in:
> STAX: 6783 6834
>
> essentially equivalent to the XPP version

What about if you remove the logging for the XPP version too?

-Yonik
Re: stax vs xpp XmlUpdateHandler
On 6/29/07, Ryan McKinley [EMAIL PROTECTED] wrote:
> > so we should test that use case (ie: add containing 1 small documents;
>
> For processing a single request with 1 documents, the existing XPP
> update handler is faster than the new StaxUpdateHandler.
> XPP: 6888 6714
> STAX: 8665 8313

Have you tried Woodstox to see how it compares?

If you do more testing, in addition to Hoss' recommendations, I'd also remove the unused elements (copyFields, dynamicFields) from the schema (as well as testing w/o logging).

-Yonik
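When comparing StAX implementations such as Woodstox, it helps to confirm which one a test run actually used, because XMLInputFactory.newInstance() resolves the implementation at runtime. The sketch below just prints the factory class in use; the class name StaxImplementationCheck is made up for this example, and the printed name depends entirely on the JVM and on which jars are on the classpath.

```java
import javax.xml.stream.XMLInputFactory;

final class StaxImplementationCheck {
    /** Class name of the StAX input factory selected for this JVM and classpath. */
    static String implementationName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("StAX implementation: " + implementationName());
    }
}
```

Recording this name next to each timing makes it clear which parser a given number belongs to.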