[jira] Commented: (SOLR-129) Solrb - UTF 8 Support for add/delete
[ https://issues.apache.org/jira/browse/SOLR-129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12468920 ] Antonio Eggberg commented on SOLR-129:
--
Please close this bug. I have found the problem. For those of you who might be wondering why you see strange chars in the flare index page: you are in debug mode :-) If I'd read the code a bit more carefully :-) ... Anyway, turn off debug in your app/view pages. However, the problem of post, i.e. add document, still exists. This is above my Java expertise, so here is the error log:

SEVERE: org.xmlpull.v1.XmlPullParserException: could not resolve entity named 'aring' (position: START_TAG seen ...field name=\'description_text\'Tv&aring;... @1:115)
    at org.xmlpull.mxp1.MXParser.nextImpl(MXParser.java:1282)
    at org.xmlpull.mxp1.MXParser.next(MXParser.java:1093)
    at org.xmlpull.mxp1.MXParser.nextText(MXParser.java:1058)
    at org.apache.solr.core.SolrCore.readDoc(SolrCore.java:927)
    at org.apache.solr.core.SolrCore.update(SolrCore.java:720)
    at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:53)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:616)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:428)
    at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:473)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:568)
    at org.mortbay.http.HttpContext.handle(HttpContext.java:1530)
    at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:633)
    at org.mortbay.http.HttpContext.handle(HttpContext.java:1482)
    at org.mortbay.http.HttpServer.service(HttpServer.java:909)
    at org.mortbay.http.HttpConnection.service(HttpConnection.java:820)
    at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:986)
    at org.mortbay.http.HttpConnection.handle(HttpConnection.java:837)
    at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:245)
    at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
    at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)

Solrb - UTF 8 Support for add/delete
Key: SOLR-129
URL: https://issues.apache.org/jira/browse/SOLR-129
Project: Solr
Issue Type: Bug
Components: clients - ruby - flare
Environment: OSX
Reporter: Antonio Eggberg

Hi: This could be a ruby utf-8 bug. Anyway, when I try to do a UTF-8 document add via post.sh and then do a query via Solr Admin, everything works as it should. However, using the solrb ruby lib or flare, a UTF-8 doc add doesn't work as it should. I am not sure what I am doing wrong and I don't think it's Solr, cos it works as it should. Could this be the famous utf-8 ruby bug? I am using ruby 1.8.5 with rails 1.2.1 Cheers

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
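[Editor's note] The parser error above comes from sending an HTML named entity in the update XML: XML predefines only five entities (&lt;, &gt;, &amp;, &apos;, &quot;), so &aring; is undefined and the pull parser rejects it. The usual fix on the client side is to send accented characters as raw UTF-8 (or numeric references) and escape only the XML metacharacters. A minimal sketch of such an escaper (the class and method names are illustrative, not part of solrb):

```java
// Sketch: escape only the five XML metacharacters and pass non-ASCII
// text (e.g. "Två") through as raw UTF-8, so the parser never sees
// undefined named entities like &aring;.
public class XmlText {
    public static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);        // 'å' stays 'å', never becomes &aring;
            }
        }
        return out.toString();
    }
}
```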
[jira] Commented: (SOLR-85) [PATCH] Add update form to the admin screen
[ https://issues.apache.org/jira/browse/SOLR-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469074 ] Thorsten Scherler commented on SOLR-85: --- Hi Ryan, sorry for coming back so late on this, but I needed to finish up the first version of a customer project. Anyway, I saw that SOLR-104 is now applied, meaning your last patch on this issue should work fine, right? Are there any other blockers on this issue? salu2 [PATCH] Add update form to the admin screen --- Key: SOLR-85 URL: https://issues.apache.org/jira/browse/SOLR-85 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler Attachments: solar-85.png, solar-85.png, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solr-85-with-104.patch, solr-85.diff, solr-85.diff, solr-85.FINAL.diff It would be nice to have a webform to update solr via a http interface instead of using the post.sh. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-61) move XML update parsing out of SolrCore
[ https://issues.apache.org/jira/browse/SOLR-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469076 ] Thorsten Scherler commented on SOLR-61: --- Hi all, I am keen to give this issue a go; can somebody give me some hints on where to start? TIA salu2 move XML update parsing out of SolrCore --- Key: SOLR-61 URL: https://issues.apache.org/jira/browse/SOLR-61 Project: Solr Issue Type: Improvement Reporter: Yonik Seeley Priority: Minor The XML parsing in SolrCore should be decoupled and moved out. We also might consider moving to StAX based parsing, as it is now a standard and will be included in Java6 (Woodstox could be used for Java5). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-130) [Patch] [Docu] Starting a mySolr document, which tries to explain how to setup a custom solr instance
[Patch] [Docu] Starting a mySolr document, which tries to explain how to setup a custom solr instance - Key: SOLR-130 URL: https://issues.apache.org/jira/browse/SOLR-130 Project: Solr Issue Type: Task Reporter: Thorsten Scherler While developing a custom search server based on solr I took some notes about the do's and don'ts. The initial patch is not a fully finished document but may invite other devs to enhance it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
JIRA - adding docu component?
Hi all, I wonder whether we could add a docu component to our jira instance? wdyt? salu2 -- Thorsten Scherler thorsten.at.apache.org Open Source Java XML consulting, training and solutions
Re: JIRA - adding docu component?
On 1/31/07, Thorsten Scherler [EMAIL PROTECTED] wrote: I wonder whether we could add a docu component to our jira instance? Done. -Yonik
[jira] Commented: (SOLR-61) move XML update parsing out of SolrCore
[ https://issues.apache.org/jira/browse/SOLR-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469108 ] Ryan McKinley commented on SOLR-61: --- in SOLR-104, XML parsing was moved from SolrCore to XmlUpdateRequestHandler: http://svn.apache.org/repos/asf/lucene/solr/trunk/src/java/org/apache/solr/handler/XmlUpdateRequestHandler.java move XML update parsing out of SolrCore --- Key: SOLR-61 URL: https://issues.apache.org/jira/browse/SOLR-61 Project: Solr Issue Type: Improvement Reporter: Yonik Seeley Priority: Minor The XML parsing in SolrCore should be decoupled and moved out. We also might consider moving to StAX based parsing, as it is now a standard and will be included in Java6 (Woodstox could be used for Java5). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-85) [PATCH] Add update form to the admin screen
[ https://issues.apache.org/jira/browse/SOLR-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469104 ] Ryan McKinley commented on SOLR-85: --- the last patch (solr-85-with-104.patch) should work fine; no blocker issues. ryan [PATCH] Add update form to the admin screen --- Key: SOLR-85 URL: https://issues.apache.org/jira/browse/SOLR-85 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler Attachments: solar-85.png, solar-85.png, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solr-85-with-104.patch, solr-85.diff, solr-85.diff, solr-85.FINAL.diff It would be nice to have a webform to update solr via a http interface instead of using the post.sh. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-130) [Patch] [Docu] Starting a mySolr document, which tries to explain how to setup a custom solr instance
[ https://issues.apache.org/jira/browse/SOLR-130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469121 ] Antonio Eggberg commented on SOLR-130:
--
Wow! You must be reading my mind :-) I can contribute with questions :-) As a newbie non-Java user from an enterprise perspective, I am your ideal target :-) Having said that, I'd like to know about the following:
1. The schema.xml and solrconfig.xml are in parts very well explained. But in some areas, like indexDefaults for example, there is no explanation. It would be nice to get more info there. Specifically, for example: if I increase mergeFactor to 1000, what will happen? What is the highest value for each property? What is, for example, a safe value?
2. It would be nice to describe deployment scenarios, i.e. a single server install with XXX CPU and YYY memory just running Solr with AAA thousand docs: how should your config look and why? And you can get about xxx queries/sec, or something like that.
3. It would be nice to have a multi-server deployment with some server spec and then how the deployment should look.
4. It would also be nice to have more info regarding stopwords, synonyms, faceting, etc.
I know that all of the above are case by case, cos configuration by default means case by case. But what I want to propose is a Guidelines or Best Practices document based on the production implementations/deployments you have done with Cocoon. It would be nice to have some real world stories. I think you should do like the Subversion book: a Solr open source book! :-)
[Patch] [Docu] Starting a mySolr document, which tries to explain how to setup a custom solr instance - Key: SOLR-130 URL: https://issues.apache.org/jira/browse/SOLR-130 Project: Solr Issue Type: Task Reporter: Thorsten Scherler Attachments: SOLR-130.diff While developing a custom search server based on solr I took some notes about the do's and don'ts. The initial patch is not a fully finished document but may invite other devs to enhance it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-109) variable substitution in lucene query params
[ https://issues.apache.org/jira/browse/SOLR-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thorsten Scherler updated SOLR-109: --- Attachment: SOLR-109.diff This is a first start. What is still missing is ... a more general solution might be to modify the SolrQueryParser directly to have a new void setParamVariables(SolrParams p) method. if it's called (with non null input), then any string that SolrQueryParser instance is asked to parse would first be preprocessed looking for the ${} pattern and pulling the values out of the SolrParams instance. I need to have a closer look at what Hoss means exactly by this. However, I get lots of errors after an svn up and I am not sure whether my local changes have caused this. variable substitution in lucene query params Key: SOLR-109 URL: https://issues.apache.org/jira/browse/SOLR-109 Project: Solr Issue Type: New Feature Reporter: Thorsten Scherler Attachments: SOLR-109.diff Allowing variable substitution in the lucene query params seems pretty slick ... a more general solution might be to modify the SolrQueryParser directly to have a new void setParamVariables(SolrParams p) method. http://marc.theaimsgroup.com/?t=11671237641r=1w=2 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
/update/xml dropping exceptions
I haven't looked into it yet, but it seems like any problems in a request to /update/xml get lost somewhere... a positive response is always returned. -Yonik
[jira] Commented: (SOLR-85) [PATCH] Add update form to the admin screen
[ https://issues.apache.org/jira/browse/SOLR-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469145 ] Yonik Seeley commented on SOLR-85: -- Ryan, see Thorsten's last patch: solar-85.with.file.upload.diff that addressed some previous comments (separate update page, able to be disabled from solrconfig, etc) [PATCH] Add update form to the admin screen --- Key: SOLR-85 URL: https://issues.apache.org/jira/browse/SOLR-85 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler Attachments: solar-85.png, solar-85.png, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solr-85-with-104.patch, solr-85.diff, solr-85.diff, solr-85.FINAL.diff It would be nice to have a webform to update solr via a http interface instead of using the post.sh. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-85) [PATCH] Add update form to the admin screen
[ https://issues.apache.org/jira/browse/SOLR-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469156 ] Yonik Seeley commented on SOLR-85: -- If you click on Manage Attachments (do you have that link?) it shows the date each attachment was added. That's why I prefer versions of a patch all added under the same name... then JIRA takes care of telling me which is newest by graying out the old ones. [PATCH] Add update form to the admin screen --- Key: SOLR-85 URL: https://issues.apache.org/jira/browse/SOLR-85 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler Attachments: solar-85.png, solar-85.png, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solr-85-with-104.patch, solr-85.diff, solr-85.diff, solr-85.FINAL.diff It would be nice to have a webform to update solr via a http interface instead of using the post.sh. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-126) Auto-commit documents after time interval
[ https://issues.apache.org/jira/browse/SOLR-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469187 ] Mike Klaas commented on SOLR-126: -
Ryan: looking good! A few comments:
- You notify the tracker that the document is added before actually adding the document. This is okay (commit() cannot run until addDoc() is complete), but it does mean that the autocommit maxTime is measured from the start of the document being added until after it has been processed. I'm not sure it matters in practice.
- Similarly, didCommit() is invoked before the searcher is warmed. Autocommits will never occur simultaneously (as you note, due to synchronization of run()), but they could be invoked continually if warming takes a long time.
- If 250ms is a small enough time to not care about, does it make sense to force the user to specify the time in milliseconds?
These are all relatively minor things; if no one else has any thoughts this can probably be committed soon.
Auto-commit documents after time interval - Key: SOLR-126 URL: https://issues.apache.org/jira/browse/SOLR-126 Project: Solr Issue Type: Improvement Components: update Reporter: Ryan McKinley Priority: Minor Attachments: AutoCommit.patch, AutocommitingUpdateRequestHandler.patch If an index is getting updated from multiple sources and needs to add documents reasonably quickly, there should be a good solr side mechanism to help prevent the client from spawning multiple overlapping <commit/> commands. My specific use case is sending each document to solr every time hibernate saves an object (see SOLR-20). This happens from multiple machines simultaneously. I'd like solr to make sure the documents are committed within a second. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-126) Auto-commit documents after time interval
[ https://issues.apache.org/jira/browse/SOLR-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469204 ] Ryan McKinley commented on SOLR-126: -
> You notify the tracker that the document is added before actually adding the document. This is okay (commit() cannot run until addDoc() is complete), but it does mean that the autocommit maxTime is measured from the start of the document being added until after it has been processed. I'm not sure it matters in practice.
I'm looking at it from the client perspective. The timer should start as close to the request time as possible.
> Similarly, didCommit() is invoked before the searcher is warmed. Autocommits will never occur simultaneously (as you note, due to synchronization of run()), but they could be invoked continually if warming takes a long time.
I just left it where it was in the existing code. I think it makes sense because the searcher has the proper data at that point; a second commit won't change the results. Also, it will not start a new autocommit until the first has warmed the searcher anyway:
  CommitUpdateCommand command = new CommitUpdateCommand( false );
  command.waitFlush = true;
  command.waitSearcher = true;
> If 250ms is a small enough time to not care about, does it make sense to force the user to specify the time in milliseconds?
This is trying to avoid the case where 100 documents are added at the same time with maxDocs=10. We don't want to commit 10 times, so it waits 1/4 sec (could be shorter or longer, in my opinion). If anyone is worried about the timing, they should use maxTime, not maxDocs.
Auto-commit documents after time interval - Key: SOLR-126 URL: https://issues.apache.org/jira/browse/SOLR-126 Project: Solr Issue Type: Improvement Components: update Reporter: Ryan McKinley Priority: Minor Attachments: AutoCommit.patch, AutocommitingUpdateRequestHandler.patch If an index is getting updated from multiple sources and needs to add documents reasonably quickly, there should be a good solr side mechanism to help prevent the client from spawning multiple overlapping <commit/> commands. My specific use case is sending each document to solr every time hibernate saves an object (see SOLR-20). This happens from multiple machines simultaneously. I'd like solr to make sure the documents are committed within a second. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
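[Editor's note] The maxDocs/maxTime bookkeeping being debated above can be reduced to a small piece of pure logic. This is a minimal sketch of that bookkeeping only, not the actual patch: the class and method names are illustrative, the timer starts at the first uncommitted add (Ryan's "as close to the request time as possible"), and the threading/scheduling side is omitted.

```java
// Sketch of the autocommit bookkeeping discussed above: commit when
// either maxDocs uncommitted adds have accumulated, or maxTime ms have
// passed since the first uncommitted add. Illustrative names only.
public class CommitTracker {
    private final int maxDocs;
    private final long maxTimeMs;
    private int pendingDocs = 0;
    private long firstAddTime = -1;

    public CommitTracker(int maxDocs, long maxTimeMs) {
        this.maxDocs = maxDocs;
        this.maxTimeMs = maxTimeMs;
    }

    /** Called as close to the add request as possible. */
    public void addedDocument(long now) {
        if (pendingDocs == 0) firstAddTime = now; // timer starts at first add
        pendingDocs++;
    }

    /** True once either threshold has been crossed. */
    public boolean shouldCommit(long now) {
        if (pendingDocs == 0) return false;
        return pendingDocs >= maxDocs || now - firstAddTime >= maxTimeMs;
    }

    /** Called after the commit (and searcher warming) completes. */
    public void didCommit() {
        pendingDocs = 0;
        firstAddTime = -1;
    }
}
```

Note how this mirrors Mike's observation: the maxTime window is measured from when the add is first seen, not from when processing finishes.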
[jira] Updated: (SOLR-130) [Patch] [Docu] Starting a mySolr document, which tries to explain how to setup a custom solr instance
[ https://issues.apache.org/jira/browse/SOLR-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-130: -- Component/s: documentation [Patch] [Docu] Starting a mySolr document, which tries to explain how to setup a custom solr instance - Key: SOLR-130 URL: https://issues.apache.org/jira/browse/SOLR-130 Project: Solr Issue Type: Task Components: documentation Reporter: Thorsten Scherler Attachments: SOLR-130.diff While developing a custom search server based on solr I took some notes about the do's and don'ts. The initial patch is not a fully finished document but may invite other devs to enhance it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
empty contentStream?
I'm trying to implement SOLR-85 using SOLR-104 content streams... but it raises a simple behavior question. If you have a form:

<form>
  <textarea name="stream.body"></textarea>
  <input type="file" name="file"/>
</form>

If you upload a file, the update plugin is sent two content streams: one with the contents of the file, the other with empty contents. As written, the XmlUpdateHandler parses each stream and breaks when it hits the empty string. Options:
1. this should be implemented with two forms - every field sent should be used
2. if stream.body.trim().length() == 0, don't make a stream
I vote for #2, thoughts?
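[Editor's note] Option #2 above (skip building a content stream when stream.body is all whitespace) could look roughly like this. Plain strings stand in for Solr's ContentStream type here, and the class and method names are illustrative, not from the patch:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of option #2: only turn stream.body parameter values into
// content streams when they contain non-whitespace text, so an empty
// textarea in the form doesn't produce an empty stream that breaks
// the XML parser.
public class StreamParams {
    public static List<String> bodiesToStreams(String[] bodies) {
        List<String> streams = new ArrayList<String>();
        if (bodies != null) {
            for (String body : bodies) {
                if (body != null && body.trim().length() > 0) {
                    streams.add(body); // a real impl would wrap this in a ContentStream
                }
            }
        }
        return streams;
    }
}
```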
Re: loading many documents by ID
On Jan 31, 2007, at 6:39 PM, Chris Hostetter wrote: : Oh, and there have been numerous people interested in updateable : documents, so it would be nice if that part was in the update handler. We'd have to make it very clear that this only works if all fields are STORED. That is perfectly reasonable, for sure. And I would support an update feature issuing an exception if it detected this case. There is an important caveat to all fields being stored though... if an update was sending in updated fields for all the non-stored fields, and only stored fields were being copied internally, all would be fine too. I think eventually we could have this sort of feature internally copy the terms for non-stored fields somehow, but maybe that would only come along once Lucene supported something to facilitate this more? Erik
Re: [jira] Created: (SOLR-131) tutorial update: faceting, highlighting, etc
What about putting the tutorial completely on the wiki? We could pull the wiki page into a distribution to lock it in statically. Just a thought. I like it being off the wiki actually, but with the wiki anyone can lend a hand in wordsmithing and updating. Erik On Jan 31, 2007, at 9:31 PM, Yonik Seeley (JIRA) wrote: tutorial update: faceting, highlighting, etc Key: SOLR-131 URL: https://issues.apache.org/jira/browse/SOLR-131 Project: Solr Issue Type: Improvement Components: documentation Reporter: Yonik Seeley The tutorial hasn't really been changed since we entered the incubator. Highlighting and Faceting might be nice additions. Looking back, I wish I had chosen a different data set like books or movies (or a mix of both)... something that wouldn't get out of date as fast as electronics, and that more people could identify with. The biggest downside is that examples in the Wiki refer to the current example docs. Breaking it into multiple pages, and a screenshot or two, wouldn't be a bad idea either. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: empty contentStream?
On 1/31/07, Ryan McKinley [EMAIL PROTECTED] wrote: Options: 1. this should be implemented with two forms - every field sent should be used 2. if stream.body.trim().length() == 0, don't make a stream I vote for #2, thoughts? Sigh... yes, it's practical. -Yonik
resin and UTF-8 in URLs
So, we've conquered UTF-8 input in URLs for Jetty and Tomcat, so how about Resin? Right now, I can't get Resin 3.0.22 to see an e with a circumflex (ê) via the following: curl -i 'http://localhost:8983/solr/select?q=%C3%AA&echoParams=explicit' -Yonik
[jira] Commented: (SOLR-85) [PATCH] Add update form to the admin screen
[ https://issues.apache.org/jira/browse/SOLR-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469325 ] Ryan McKinley commented on SOLR-85: --- Ok, this one is based on solar-85.with.file.upload.diff! It also adds a few minor fixes / adjustments to SOLR-104 [PATCH] Add update form to the admin screen --- Key: SOLR-85 URL: https://issues.apache.org/jira/browse/SOLR-85 Project: Solr Issue Type: New Feature Components: update Reporter: Thorsten Scherler Attachments: solar-85.png, solar-85.png, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, solr-85-with-104.patch, solr-85-with-104.patch, solr-85.diff, solr-85.diff, solr-85.FINAL.diff It would be nice to have a webform to update solr via a http interface instead of using the post.sh. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: loading many documents by ID
On 1/31/07, Erik Hatcher [EMAIL PROTECTED] wrote: On Jan 31, 2007, at 6:39 PM, Chris Hostetter wrote: : Oh, and there have been numerous people interested in updateable : documents, so it would be nice if that part was in the update handler. We'd have to make it very clear that this only works if all fields are STORED. That is perfectly reasonable, for sure. And I would support an update feature issuing an exception if it detected this case. There is an important caveat to all fields being stored though... if an update was sending in updated fields for all the non-stored fields, and only stored fields were being copied internally, all would be fine too. I think there might be two useful types of updates: 1) overwrite original field 2) add an additional value for a multi-valued field (useful for tagging?) I think eventually we could have this sort of feature internally copy the terms for non-stored fields somehow, but maybe that would only come along once Lucene supported something to facilitate this more? Not unless you store more info (a lot more info). We should also be able to copy unstored fields with term vectors stored. ParallelReader might also hold some promise (putting a field to be updated in a separate index) The problem is that the lucene ids need to be kept in sync... I don't know how to do that w/o reindexing. -Yonik
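[Editor's note] The two update types Yonik lists can be modeled, at the level of a stored-field map, roughly like this. This assumes every field is STORED (the caveat Erik raises); the document is just a map of field name to values, and all names are illustrative rather than anything in Solr:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the two update modes discussed: (1) overwrite a stored
// field's value, (2) append a value to a multi-valued field (e.g.
// tagging). Only works when every field is STORED, since the "update"
// is really read-stored-fields, modify, re-add.
public class DocUpdate {
    /** Mode 1: replace the field's value entirely. */
    public static Map<String, List<String>> overwrite(
            Map<String, List<String>> doc, String field, String value) {
        Map<String, List<String>> out = new HashMap<String, List<String>>(doc);
        List<String> vals = new ArrayList<String>();
        vals.add(value);
        out.put(field, vals);
        return out;
    }

    /** Mode 2: add another value to a multi-valued field. */
    public static Map<String, List<String>> appendValue(
            Map<String, List<String>> doc, String field, String value) {
        Map<String, List<String>> out = new HashMap<String, List<String>>(doc);
        List<String> vals = new ArrayList<String>(
                out.containsKey(field) ? out.get(field) : new ArrayList<String>());
        vals.add(value);
        out.put(field, vals);
        return out;
    }
}
```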
Re: svn commit: r501512 - in /lucene/solr/trunk: ./ src/java/org/apache/solr/core/ src/java/org/apache/solr/handler/ src/java/org/apache/solr/request/ src/java/org/apache/solr/search/ src/java/org/apa
TODO: switch solrb to using wt=json instead of wt=ruby. Whatcha think, Ed et al? Erik On Jan 30, 2007, at 1:36 PM, [EMAIL PROTECTED] wrote: Author: yonik Date: Tue Jan 30 10:36:32 2007 New Revision: 501512 URL: http://svn.apache.org/viewvc?view=revrev=501512 Log: SimpleOrderedMap, JSON named list changes: SOLR-125
Re: svn commit: r501512 - in /lucene/solr/trunk: ./ src/java/org/apache/solr/core/ src/java/org/apache/solr/handler/ src/java/org/apache/solr/request/ src/java/org/apache/solr/search/ src/java/org/apa
On 1/31/07, Erik Hatcher [EMAIL PROTECTED] wrote: TODO: switch solrb to using wt=json instead of wt=ruby. Why is that? -Yonik
charset in POST from browser
It seems that browsers do a form POST in the charset that the page was encoded in. Modifying form.jsp in solr/admin seems to work... the data comes across encoded in UTF8. The problem is that the charset isn't defined to be UTF-8 in the headers, so the bytes are assumed to be latin-1. Is this a problem we can fix in solr, or is it purely container config? This will mimic what the browser sends back: curl -i http://localhost:8983/solr/select -d 'q=%C3%AA' -Yonik
Re: loading many documents by ID
On 1/31/07 3:39 PM, Chris Hostetter [EMAIL PROTECTED] wrote: : Oh, and there have been numerous people interested in updateable : documents, so it would be nice if that part was in the update handler. We'd have to make it very clear that this only works if all fields are STORED. Isn't there some way to do this automatically instead of relying on documentation? We might need to add something, maybe a required attribute on fields, but a runtime error would be much, much better than a page on the wiki. wunder
Re: loading many documents by ID
We'd have to make it very clear that this only works if all fields are STORED. Isn't there some way to do this automatically instead of relying on documentation? We might need to add something, maybe a required attribute on fields, but a runtime error would be much, much better than a page on the wiki. what about copyField? With copyField, it is reasonable to have fields that are not stored and are generated from the other stored fields. (this is what my setup looks like)
[jira] Closed: (SOLR-129) Solrb - UTF 8 Support for add/delete
[ https://issues.apache.org/jira/browse/SOLR-129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher closed SOLR-129. - Resolution: Cannot Reproduce I added a controller and view to display the features from the utf8-example.xml file to flare. 1) fire up the Solr example application, and post.sh *.xml from the exampledocs directory. 2) fire up flare, hit /i18n (http://localhost:3000/i18n). Showing all the accented characters worked fine for me. I suspect we probably still have some i18n issues to iron out, so any help or at least test cases in that regard would be most helpful. Solrb - UTF 8 Support for add/delete Key: SOLR-129 URL: https://issues.apache.org/jira/browse/SOLR-129 Project: Solr Issue Type: Bug Components: clients - ruby - flare Environment: OSX Reporter: Antonio Eggberg Hi: This could be a ruby utf-8 bug. Anyway when I try to do a UTF-8 document add via post.sh and then do query via Solr Admin everything works as it should. However using the solrb ruby lib or flare UTF-8 doc add doesn't work as it should. I am not sure what I am doing wrong and I don't think its Solr cos it works as it should. Could this be a famous utf-8 ruby bug? I am using ruby 1.8.5 with rails 1.2.1 Cheers -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: loading many documents by ID
On 1/31/07 9:05 PM, Ryan McKinley [EMAIL PROTECTED] wrote: We'd have to make it very clear that this only works if all fields are STORED. Isn't there some way to do this automatically instead of relying on documentation? We might need to add something, maybe a required attribute on fields, but a runtime error would be much, much better than a page on the wiki. what about copyField? With copyField, it is reasonable to have fields that are not stored and are generated from the other stored fields. (this is what my setup looks like). Mine, too. That is why I suggested explicit declarations in the schema to say which fields are required. wunder
Re: empty contentStream?
: 1. this should be implemented with two forms - every field sent should be used : 2. if stream.body.trim().length() == 0, don't make a stream : : I vote for #2, thoughts? : : Sigh... yes, it's practical. Alternate Idea #3: make the XmlUpdateRequestHandler more robust in receiving empty streams (treat it as a NOOP, maybe return an error if *all* the streams are empty) i'm okay with #2 as long as it's only in the stream.body parsing and not something we try to do with every stream. -Hoss
Re: charset in POST from browser
On 2/1/07, Chris Hostetter [EMAIL PROTECTED] wrote: : The problem is that the charset isn't defined to be UTF-8 in the : headers, so the bytes are assumed to be latin-1. : : Is this a problem we can fix in solr, or is it purely container config? umm... we already fixed this the best way i know how in SOLR-35 ... all of the JSPs that have forms should have this in them... <%@ page contentType="text/html; charset=utf-8" pageEncoding="UTF-8" %> ...is resin not respecting that? The form that gets sent to the browser is in UTF8, and the browser correctly sends back UTF8 in the post body. *But* the browser doesn't tell the container what the charset of the body is, so it's up to the container to guess. By default, resin seems to pick latin-1. It seems like we should assume UTF-8 if no charset is sent for a text content type. -Yonik
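[Editor's note] The fallback Yonik proposes (assume UTF-8 when the POST carries no charset parameter) boils down to inspecting the Content-Type header. A minimal sketch of that rule as a plain helper (the class and method names are illustrative; the real fix would live in container config or a servlet filter calling request.setCharacterEncoding):

```java
// Sketch: pick the request body charset from a Content-Type header,
// defaulting to UTF-8 when the browser omitted the charset parameter,
// as browsers typically do for application/x-www-form-urlencoded.
public class CharsetUtil {
    public static String charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase().startsWith("charset=")) {
                    return p.substring("charset=".length()).trim();
                }
            }
        }
        return "UTF-8"; // the proposed default, instead of the container's Latin-1
    }
}
```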
Re: resin and UTF-8 in URLs
I just tried this on two systems... it worked on one (I got the ê) and on the other I get Ãª -- both running resin 3.0.21 The one that works has http://securityfilter.sourceforge.net/ applied. I'll look into what securityfilter is doing... it may be setting something explicitly
Re: empty contentStream?
I just posted SOLR-85 using strategy #2. It makes sure stream.body and stream.url have content before making streams out of them. I think this makes sense given they are likely to be used in forms similar to the 'update.jsp' where they may or may not have content. i'm okay with #2 as long as it's only in the stream.body parsing and not something we try to do with every stream. I totally agree it should not check 'real' streams, but these are essentially helper streams that make it easy to post a stream from a form.
Re: resin and UTF-8 in URLs
On 2/1/07, Ryan McKinley [EMAIL PROTECTED] wrote: I just tried this on two systems... it worked on one (I got the ê) and the other I get ê -- both running resin 3.0.21 A co-worker informed me that adding a character-encoding attribute to the web-app tag in web.xml will force a charset if not defined. Seems to work for both GET and POST. <web-app character-encoding="utf-8"> This looks resin-specific though. -Yonik
Re: charset in POST from browser
: The form that gets sent to the browser is in UTF8, and the browser
: correctly sends back UTF8 in the post body. *But* the browser doesn't
: tell the container what the charset of the body is, so it's up to the
: container to guess. By default, resin seems to pick latin-1.

That's really weird ... i could have sworn browsers doing POST of form data were supposed to send a full content-type...

  Content-type: application/x-www-form-urlencoded; charset=utf-8

...picking the charset based on the charset of the page containing the form (i assume you tested and verified this isn't happening?)

a quick google search turned up this page, with this info...

http://www.systemvikar.biz/faq/servlet.xtp

  "Form character encoding doesn't work"

  A POST request with application/x-www-form-urlencoded doesn't contain
  any information about the character encoding of the request, so Resin
  needs to use a set of heuristics to decode the form. Here's the order:

  1. request.getAttribute("caucho.form.character.encoding")
  2. The response.setContentType() encoding of the page.
  3. The character-encoding tag in the resin.conf.

  Resin uses the default character encoding of your JVM to read form
  data. To set the encoding to another charset, you'll need to change
  the resin.conf as follows:

    <http-server character-encoding='Shift_JIS'>
      ...
    </http-server>
Re: empty contentStream?
: It makes sure stream.body and stream.url have content before making
: streams out of them. I think this makes sense given they are likely
: to be used in forms similar to the 'update.jsp' where they may or may
: not have content.

yeah ... good call.

-Hoss
Re: svn commit: r501512 - in /lucene/solr/trunk: ./ src/java/org/apache/solr/core/ src/java/org/apache/solr/handler/ src/java/org/apache/solr/request/ src/java/org/apache/solr/search/ src/java/org/apa
On Jan 31, 2007, at 11:08 PM, Yonik Seeley wrote:
: On 1/31/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
: : TODO: switch solrb to using wt=json instead of wt=ruby.
: :
: : Why is that?
:
: To benefit from a richer data structure, avoid eval (which I hear is
: likely to be slower than parsing JSON, and eval is potentially more
: dangerous if code somehow got slipped in, though that risk is not very
: high). The downside is that we'd need to add a dependency on a JSON
: parsing library.

JSON is close enough to Ruby syntax that it can practically be eval'd, interestingly, but I don't think it's close enough.

Erik
Re: charset in POST from browser
On 2/1/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: : The form that gets sent to the browser is in UTF8, and the browser
: : correctly sends back UTF8 in the post body. *But* the browser doesn't
: : tell the container what the charset of the body is, so it's up to the
: : container to guess. By default, resin seems to pick latin-1.
:
: That's really weird ... i could have sworn browsers doing POST of form
: data were supposed to send a full content-type...
:
:   Content-type: application/x-www-form-urlencoded; charset=utf-8
:
: ...picking the charset based on the charset of the page containing the
: form (i assume you tested and verified this isn't happening?)

Yep, Firefox 2. I'd serve the page, do a search, kill the solr server, run "nc -l -p 8983", and run the search again. The body was encoded correctly, but just no charset info. I tried setting it explicitly by appending to enctype in the form, but it doesn't go through.

-Yonik
Re: svn commit: r501512 - in /lucene/solr/trunk: ./ src/java/org/apache/solr/core/ src/java/org/apache/solr/handler/ src/java/org/apache/solr/request/ src/java/org/apache/solr/search/ src/java/org/apa
On 2/1/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
: On Jan 31, 2007, at 11:08 PM, Yonik Seeley wrote:
: : On 1/31/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
: : : TODO: switch solrb to using wt=json instead of wt=ruby.
: : :
: : : Why is that?
:
: To benefit from a richer data structure,

They seem to have the same power there. Bear in mind that json params like json.nl apply to its subtypes, ruby and python, also.

: avoid eval (which I hear is likely to be slower than parsing JSON,

If the JSON parser is written in C, yes. Otherwise, I doubt it :-)

: and eval is potentially more dangerous if code somehow got slipped in
: though that risk is not very high).

Yeah, I guess someone would have to say, here, point your client at my solr system, and then they could be running something else that gives you executable code. But they could also just give you bogus data, so it's bad to point at random things anyway. (but I guess it *is* worse if you are trying to operate in some federated mode across the internet with unknown peers)

: The downside is that we'd need to add a dependency on a JSON parsing
: library. JSON is close enough to Ruby syntax that it can practically
: be eval'd, interestingly, but I don't think it's close enough.
:
: Erik
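To make the eval-versus-JSON trade-off concrete, here is a hedged Ruby sketch. The response bodies are made-up miniatures of what wt=ruby and wt=json return, not captured Solr output:

```ruby
require 'json'

# wt=ruby emits Ruby hash-literal syntax intended to be eval'd;
# wt=json emits the equivalent JSON.
ruby_body = "{'responseHeader'=>{'status'=>0,'QTime'=>1}}"
json_body = '{"responseHeader":{"status":0,"QTime":1}}'

via_eval = eval(ruby_body)       # executes whatever the server sent as code
via_json = JSON.parse(json_body) # parses data only; never executes it

puts via_eval == via_json        # true -- same structure either way
```

Both paths yield the same hash for well-behaved responses, which is why the choice comes down to the safety and dependency questions discussed above rather than expressive power.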