date:20070131


[ 
https://issues.apache.org/jira/browse/SOLR-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469074
 ] 

Thorsten Scherler commented on SOLR-85:
---

Hi Ryan,

sorry for coming back to this so late, but I needed to finish the first 
version of a customer project.

Anyway, I saw that SOLR-104 is now applied, meaning your last patch on this 
issue should work fine, right?

Are there any other blockers on this issue?

salu2

 [PATCH] Add update form to the admin screen
 ---

 Key: SOLR-85
 URL: https://issues.apache.org/jira/browse/SOLR-85
 Project: Solr
  Issue Type: New Feature
  Components: update
Reporter: Thorsten Scherler
 Attachments: solar-85.png, solar-85.png, 
 solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, 
 solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, 
 solr-85-with-104.patch, solr-85.diff, solr-85.diff, solr-85.FINAL.diff


 It would be nice to have a webform to update solr via a http interface 
 instead of using the post.sh.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (SOLR-61) move XML update parsing out of SolrCore


[ 
https://issues.apache.org/jira/browse/SOLR-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469076
 ] 

Thorsten Scherler commented on SOLR-61:
---

Hi all,

I am keen to give this issue a go. Can somebody give me some hints on where to start?

TIA

salu2

 move XML update parsing out of SolrCore
 ---

 Key: SOLR-61
 URL: https://issues.apache.org/jira/browse/SOLR-61
 Project: Solr
  Issue Type: Improvement
Reporter: Yonik Seeley
Priority: Minor

 The XML parsing in SolrCore should be decoupled and moved out.
 We also might consider moving to StAX based parsing, as it is now a standard 
 and will be included in Java6 (Woodstox could be used for Java5).


[jira] Created: (SOLR-130) [Patch] [Docu] Starting a mySolr document, which tries to explain how to setup a custom solr instance

[Patch] [Docu] Starting a mySolr document, which tries to explain how to setup 
a custom solr instance
-

 Key: SOLR-130
 URL: https://issues.apache.org/jira/browse/SOLR-130
 Project: Solr
  Issue Type: Task
Reporter: Thorsten Scherler


While developing a custom search server based on solr I took some notes about 
the do's and don'ts. The initial patch is not a fully finished document but may 
invite other devs to enhance it.


JIRA - adding docu component?

2007-01-31 Thread Thorsten Scherler

Hi all,

I wonder whether we could add a docu component to our jira instance?

wdyt?

salu2
-- 
Thorsten Scherler   thorsten.at.apache.org
Open Source Java & XML consulting, training and solutions

Re: JIRA - adding docu component?


On 1/31/07, Thorsten Scherler [EMAIL PROTECTED] wrote:

I wonder whether we could add a docu component to our jira instance?


Done.

-Yonik

[jira] Commented: (SOLR-61) move XML update parsing out of SolrCore


[ 
https://issues.apache.org/jira/browse/SOLR-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469108
 ] 

Ryan McKinley commented on SOLR-61:
---

In SOLR-104, XML parsing moved from SolrCore to XmlUpdateRequestHandler:

http://svn.apache.org/repos/asf/lucene/solr/trunk/src/java/org/apache/solr/handler/XmlUpdateRequestHandler.java

 move XML update parsing out of SolrCore
 ---

 Key: SOLR-61
 URL: https://issues.apache.org/jira/browse/SOLR-61
 Project: Solr
  Issue Type: Improvement
Reporter: Yonik Seeley
Priority: Minor

 The XML parsing in SolrCore should be decoupled and moved out.
 We also might consider moving to StAX based parsing, as it is now a standard 
 and will be included in Java6 (Woodstox could be used for Java5).


[jira] Commented: (SOLR-85) [PATCH] Add update form to the admin screen


[ 
https://issues.apache.org/jira/browse/SOLR-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469104
 ] 

Ryan McKinley commented on SOLR-85:
---

the last patch (solr-85-with-104.patch) should work fine

no blocker issues

ryan

 [PATCH] Add update form to the admin screen
 ---

 Key: SOLR-85
 URL: https://issues.apache.org/jira/browse/SOLR-85
 Project: Solr
  Issue Type: New Feature
  Components: update
Reporter: Thorsten Scherler
 Attachments: solar-85.png, solar-85.png, 
 solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, 
 solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, 
 solr-85-with-104.patch, solr-85.diff, solr-85.diff, solr-85.FINAL.diff


 It would be nice to have a webform to update solr via a http interface 
 instead of using the post.sh.


[jira] Commented: (SOLR-130) [Patch] [Docu] Starting a mySolr document, which tries to explain how to setup a custom solr instance

2007-01-31 Thread Antonio Eggberg (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469121
]

Antonio Eggberg commented on SOLR-130:
--

Wow! You must be reading my mind :-) I can contribute with questions :-) As a
newbie, non-Java user coming from an enterprise perspective, I am your ideal
target :-) Having said that, I would like to know about the following:

1. The schema.xml and solrconfig.xml are in parts very well explained, but in
some areas, for example indexDefaults, there is no explanation. It would be
nice to get more info there. Specifically, what will happen if I increase
mergeFactor to 1000? What is the highest value for each property? What is a
safe value?
2. It would be nice to describe deployment scenarios, i.e. a single-server
install with XXX CPUs and YYY memory just running Solr with AAA thousand docs:
what should your config look like, and why? And that you can get about xxx
queries/sec or something.
3. It would be nice to have a multi-server deployment with some server specs,
and how the deployment should then look.
4. It would also be nice to have more info regarding the usage of stopwords,
synonyms, facets, etc.

I know that all of the above are case by case, because configuration by default
means case by case. But what I want to propose is guidelines or best
practices based on the production implementations/deployments you have done
with Cocoon. It would be nice to have some real-world stories.

I think you should do it like the Subversion book! - A Solr open source book! :-)

[Patch] [Docu] Starting a mySolr document, which tries to explain how to
setup a custom solr instance
-

Key: SOLR-130
URL: https://issues.apache.org/jira/browse/SOLR-130
Project: Solr
Issue Type: Task
Reporter: Thorsten Scherler
Attachments: SOLR-130.diff

While developing a custom search server based on solr I took some notes about
the do's and don'ts. The initial patch is not a fully finished document but
may invite other devs to enhance it.


[jira] Updated: (SOLR-109) variable substitution in lucene query params


 [ 
https://issues.apache.org/jira/browse/SOLR-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thorsten Scherler updated SOLR-109:
---

Attachment: SOLR-109.diff

This is a first start.

What is still missing is what Hoss suggested: "... a more general solution
might be to modify the SolrQueryParser
directly to have a new void setParamVariables(SolrParams p) method.  if
it's called (with non null input), then any string that SolrQueryParser
instance is asked to parse would first be preprocessed looking for the ${}
pattern and pulling the values out of the SolrParams instance."
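
The preprocessing step quoted above could be sketched roughly as follows. This is only an illustration in plain Java and is not taken from SOLR-109.diff: the class ParamSubstitution and its substitute method are hypothetical names, and a Map<String, String> stands in for the SolrParams instance.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: expands ${name} references in a query string. */
final class ParamSubstitution {
    // Matches ${name}; group(1) captures the name between the braces.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    /**
     * Replaces each ${name} with params.get(name). References that have no
     * matching parameter are left in the query unchanged.
     */
    static String substitute(String query, Map<String, String> params) {
        if (params == null) {
            return query; // only preprocess when called with non-null input
        }
        Matcher m = VAR.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in the value literal
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<String, String>();
        params.put("user", "thorsten");
        System.out.println(substitute("owner:${user} AND type:${kind}", params));
    }
}
```

Leaving unknown references untouched is one possible choice; throwing an exception for them would be an equally reasonable policy.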

I need to have a closer look at what exactly Hoss means by this. However, I 
get lots of errors after an svn up and I am not sure whether my local changes 
have caused this.

 variable substitution in lucene query params
 

 Key: SOLR-109
 URL: https://issues.apache.org/jira/browse/SOLR-109
 Project: Solr
  Issue Type: New Feature
Reporter: Thorsten Scherler
 Attachments: SOLR-109.diff


 Allowing variable substitution in the lucene query params seems pretty slick 
 ... a more general solution might be to modify the SolrQueryParser
 directly to have a new void setParamVariables(SolrParams p) method.
 http://marc.theaimsgroup.com/?t=11671237641r=1w=2


/update/xml dropping exceptions


I haven't looked into it yet, but it seems like any problems in a
request to /update/xml get lost somewhere... a positive response is
always returned.

-Yonik

[jira] Commented: (SOLR-85) [PATCH] Add update form to the admin screen

2007-01-31 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469145
 ] 

Yonik Seeley commented on SOLR-85:
--

Ryan, see Thorsten's last patch:  solar-85.with.file.upload.diff
that addressed some previous comments (separate update page, able to be 
disabled from solrconfig, etc)


 [PATCH] Add update form to the admin screen
 ---

 Key: SOLR-85
 URL: https://issues.apache.org/jira/browse/SOLR-85
 Project: Solr
  Issue Type: New Feature
  Components: update
Reporter: Thorsten Scherler
 Attachments: solar-85.png, solar-85.png, 
 solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, 
 solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, 
 solr-85-with-104.patch, solr-85.diff, solr-85.diff, solr-85.FINAL.diff


 It would be nice to have a webform to update solr via a http interface 
 instead of using the post.sh.


[jira] Commented: (SOLR-85) [PATCH] Add update form to the admin screen

2007-01-31 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469156
 ] 

Yonik Seeley commented on SOLR-85:
--

If you click on Manage Attachments (do you have that link?) it shows the date 
each attachment was added.
That's why I prefer versions of a patch all added under the same name... then 
JIRA takes care of telling me which is newest by graying out the old ones.

 [PATCH] Add update form to the admin screen
 ---

 Key: SOLR-85
 URL: https://issues.apache.org/jira/browse/SOLR-85
 Project: Solr
  Issue Type: New Feature
  Components: update
Reporter: Thorsten Scherler
 Attachments: solar-85.png, solar-85.png, 
 solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, 
 solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, 
 solr-85-with-104.patch, solr-85.diff, solr-85.diff, solr-85.FINAL.diff


 It would be nice to have a webform to update solr via a http interface 
 instead of using the post.sh.


[jira] Commented: (SOLR-126) Auto-commit documents after time interval

2007-01-31 Thread Mike Klaas (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469187
]

Mike Klaas commented on SOLR-126:
-

Ryan: looking good! A few comments:

- You notify the tracker that the document is added before actually adding the
document. This is okay--commit() cannot run until addDoc() is complete--but it
does mean that the autocommit maxTime is measured from the start of the
document being added until after it has been processed. I'm not sure it
matters in practice.

- Similarly, didCommit() is invoked before the searcher is warmed. Autocommits
will never occur simultaneously (as you note; due to synchronization of run()),
but they could be invoked continually if warming takes a long time.

- If 250ms is a small enough time to not care about, does it make sense to
force the user to specify the time in milliseconds?

These are all relatively minor things--if no one else has any thoughts this can
probably be committed soon.

Auto-commit documents after time interval
-

Key: SOLR-126
URL: https://issues.apache.org/jira/browse/SOLR-126
Project: Solr
Issue Type: Improvement
Components: update
Reporter: Ryan McKinley
Priority: Minor
Attachments: AutoCommit.patch, AutocommitingUpdateRequestHandler.patch

If an index is getting updated from multiple sources and needs to add
documents reasonably quickly, there should be a good solr side mechanism to
help prevent the client from spawning multiple overlapping <commit/> commands.
My specific use case is sending each document to solr every time hibernate
saves an object (see SOLR-20). This happens from multiple machines
simultaneously. I'd like solr to make sure the documents are committed
within a second.


[jira] Commented: (SOLR-126) Auto-commit documents after time interval

[
https://issues.apache.org/jira/browse/SOLR-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469204
]

Ryan McKinley commented on SOLR-126:

- You notify the tracker that the document is added before actually adding
the document. This is okay--commit() cannot run until addDoc() is
complete--but it does mean that the autocommit maxTime is measured from the
start of the document being added until after it has been processed. I'm not
sure it matters in practice.

I'm looking at it from the client perspective. The timer should start as
close to the request time as possible.

- Similarly, didCommit() is invoked before the searcher is warmed.
Autocommits will never occur simultaneously (as you note; due to
synchronization of run()), but they could be invoked continually if warming
takes a long time.

I just left it where it was in the existing code. I think it makes sense
because the searcher has the proper data at that point - a second commit won't
change the results.

Also, it will not start a new autocommit until the first has warmed the
searcher anyway:

CommitUpdateCommand command = new CommitUpdateCommand( false );
command.waitFlush = true;
command.waitSearcher = true;

- If 250ms is a small enough time to not care about, does it make sense to
force the user to specify the time in milliseconds?

This is trying to avoid the case where 100 documents are added at the same
time with maxDocs=10. We don't want to commit 10 times, so it waits 1/4 sec.
(could be shorter or longer in my opinion)

If anyone is worried about the timing, they should use maxTime, not maxDocs.
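
The maxDocs/maxTime behaviour discussed in this exchange could be sketched as a small bookkeeping class. This is a simplified illustration and not the code from AutoCommit.patch: AutoCommitSketch and all of its members are hypothetical, and the current time is passed in explicitly so the logic can be followed without threads or timers.

```java
/** Hypothetical tracker that decides when an automatic commit is due. */
final class AutoCommitSketch {
    /** The 1/4 sec wait after maxDocs is reached, so a burst causes one commit. */
    static final long DOC_COMMIT_DELAY_MS = 250;

    private final int maxDocs;    // pending docs that trigger a commit (<= 0 disables)
    private final long maxTimeMs; // max age of the oldest pending add (<= 0 disables)
    private int pendingDocs = 0;
    private long commitDueAtMs = Long.MAX_VALUE; // MAX_VALUE: nothing scheduled

    AutoCommitSketch(int maxDocs, long maxTimeMs) {
        this.maxDocs = maxDocs;
        this.maxTimeMs = maxTimeMs;
    }

    /** Called when an add starts, so the timer runs from the request time. */
    synchronized void addedDocument(long nowMs) {
        pendingDocs++;
        if (maxTimeMs > 0 && pendingDocs == 1) {
            schedule(nowMs + maxTimeMs);
        }
        if (maxDocs > 0 && pendingDocs == maxDocs) {
            schedule(nowMs + DOC_COMMIT_DELAY_MS);
        }
    }

    /** Keeps the earliest of the scheduled commit times. */
    private void schedule(long dueMs) {
        commitDueAtMs = Math.min(commitDueAtMs, dueMs);
    }

    /** True when there are pending docs and the scheduled time has passed. */
    synchronized boolean commitDue(long nowMs) {
        return pendingDocs > 0 && nowMs >= commitDueAtMs;
    }

    /** Called once a commit has been issued; clears the pending state. */
    synchronized void didCommit() {
        pendingDocs = 0;
        commitDueAtMs = Long.MAX_VALUE;
    }
}
```

With maxDocs=10 and 100 documents added at the same instant, this sketch schedules a single commit 250 ms later rather than ten separate ones.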

Auto-commit documents after time interval
-


[jira] Updated: (SOLR-130) [Patch] [Docu] Starting a mySolr document, which tries to explain how to setup a custom solr instance

2007-01-31 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-130:
--

Component/s: documentation

 [Patch] [Docu] Starting a mySolr document, which tries to explain how to 
 setup a custom solr instance
 -

 Key: SOLR-130
 URL: https://issues.apache.org/jira/browse/SOLR-130
 Project: Solr
  Issue Type: Task
  Components: documentation
Reporter: Thorsten Scherler
 Attachments: SOLR-130.diff


 While developing a custom search server based on solr I took some notes about 
 the do's and don'ts. The initial patch is not a fully finished document but 
 may invite other devs to enhance it.


empty contentStream?


I'm trying to implement SOLR-85 using SOLR-104 content streams... but
it raises a simple behavior question.

If you have a form:
<form>
<textarea name="stream.body"></textarea>
<input type="file" name="file"/>
</form>

If you upload a file, the update plugin is sent two content streams:
one with the contents of the file, the other with empty contents ("").

As written, the XmlUpdateRequestHandler parses each stream and breaks when it
hits the empty string.

Options:
1. this should be implemented with two forms - every field sent should be used
2. if stream.body.trim().length() == 0, don't make a stream

I vote for #2, thoughts?
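
Option #2 could look roughly like the sketch below. It is an illustration only: StreamBodyFilter and nonEmptyBodies are hypothetical names, and plain strings stand in for whatever content stream objects SOLR-104 actually builds from the stream.body parameters.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: decides which stream.body values become streams. */
final class StreamBodyFilter {
    /** Returns only the stream.body values that contain non-whitespace content. */
    static List<String> nonEmptyBodies(String[] streamBodies) {
        List<String> streams = new ArrayList<String>();
        if (streamBodies == null) {
            return streams; // parameter not sent at all
        }
        for (String body : streamBodies) {
            // the check from option #2: skip bodies that trim to nothing
            if (body != null && body.trim().length() > 0) {
                streams.add(body);
            }
        }
        return streams;
    }
}
```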

Re: loading many documents by ID



On Jan 31, 2007, at 6:39 PM, Chris Hostetter wrote:

: Oh, and there have been numerous people interested in updateable
: documents, so it would be nice if that part was in the update  
handler.


We'd have to make it very clear that this only works if all fields are
STORED.


That is perfectly reasonable, for sure.  And I would support an  
update feature issuing an exception if it detected this case.


There is an important caveat to all fields being stored though... if  
an update was sending in updated fields for all the non-stored  
fields, and only stored fields were being copied internally, all  
would be fine too.


I think eventually we could have this sort of feature internally copy  
the terms for non-stored fields somehow, but maybe that would only  
come along once Lucene supported something to facilitate this more?


Erik

Re: [jira] Created: (SOLR-131) tutorial update: faceting, highlighting, etc


What about putting the tutorial completely on the wiki?

We could pull the wiki page into a distribution to lock it in  
statically.


Just a thought.  I like it being off the wiki actually, but with the  
wiki anyone can lend a hand in wordsmithing and updating.


Erik


On Jan 31, 2007, at 9:31 PM, Yonik Seeley (JIRA) wrote:


tutorial update: faceting, highlighting, etc


 Key: SOLR-131
 URL: https://issues.apache.org/jira/browse/SOLR-131
 Project: Solr
  Issue Type: Improvement
  Components: documentation
Reporter: Yonik Seeley


The tutorial hasn't really been changed since we entered the  
incubator.  Highlighting and Faceting might be nice additions.


Looking back, I wish I had chosen a different data set like books  
or movies (or a mix of both)... something that wouldn't get out of  
date as fast as electronics, and that more people could identify  
with.  The biggest downside is examples in the Wiki refer to the  
current example docs.


Breaking it into multiple pages, and adding a screenshot or two, wouldn't be
a bad idea either.



Re: empty contentStream?


On 1/31/07, Ryan McKinley [EMAIL PROTECTED] wrote:

Options:
1. this should be implemented with two forms - every field sent should be used
2. if stream.body.trim().length() == 0, don't make a stream

I vote for #2, thoughts?


Sigh... yes, it's practical.

-Yonik

resin and UTF-8 in URLs


So, we've conquered UTF-8 input in URLs for Jetty and Tomcat, so how
about Resin?

Right now, I can't get Resin 3.0.22 to see an e with a circumflex via
the following:

curl -i 'http://localhost:8983/solr/select?q=%C3%AA&echoParams=explicit'

-Yonik

[jira] Commented: (SOLR-85) [PATCH] Add update form to the admin screen


[ 
https://issues.apache.org/jira/browse/SOLR-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469325
 ] 

Ryan McKinley commented on SOLR-85:
---

Ok, this one is based on solar-85.with.file.upload.diff!

It also adds a few minor fixes / adjustments to SOLR-104

 [PATCH] Add update form to the admin screen
 ---

 Key: SOLR-85
 URL: https://issues.apache.org/jira/browse/SOLR-85
 Project: Solr
  Issue Type: New Feature
  Components: update
Reporter: Thorsten Scherler
 Attachments: solar-85.png, solar-85.png, 
 solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, 
 solar-85.with.file.upload.diff, solar-85.with.file.upload.diff, 
 solr-85-with-104.patch, solr-85-with-104.patch, solr-85.diff, solr-85.diff, 
 solr-85.FINAL.diff


 It would be nice to have a webform to update solr via a http interface 
 instead of using the post.sh.


Re: loading many documents by ID


On 1/31/07, Erik Hatcher [EMAIL PROTECTED] wrote:


On Jan 31, 2007, at 6:39 PM, Chris Hostetter wrote:
 : Oh, and there have been numerous people interested in updateable
 : documents, so it would be nice if that part was in the update
 handler.

 We'd have to make it very clear that this only works if all fields are
 STORED.

That is perfectly reasonable, for sure.  And I would support an
update feature issuing an exception if it detected this case.

There is an important caveat to all fields being stored though... if
an update was sending in updated fields for all the non-stored
fields, and only stored fields were being copied internally, all
would be fine too.


I think there might be two useful types of updates:
1) overwrite original field
2) add an additional value for a multi-valued field (useful for tagging?)
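
The two update types listed above could be sketched as follows. This is a hypothetical illustration, not existing Solr code: the stored document is modelled as a map from field name to a list of values, and FieldUpdateSketch, Mode and apply are made-up names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the two update types on a stored document. */
final class FieldUpdateSketch {
    enum Mode { OVERWRITE, APPEND }

    /** Returns a copy of the stored document with one field value applied. */
    static Map<String, List<String>> apply(Map<String, List<String>> stored,
                                           String field, String value, Mode mode) {
        // Copy first so the stored document itself is never modified.
        Map<String, List<String>> result = new LinkedHashMap<String, List<String>>();
        for (Map.Entry<String, List<String>> e : stored.entrySet()) {
            result.put(e.getKey(), new ArrayList<String>(e.getValue()));
        }
        List<String> values = result.get(field);
        if (values == null || mode == Mode.OVERWRITE) {
            values = new ArrayList<String>(); // 1) overwrite: drop the old values
            result.put(field, values);
        }
        values.add(value); // 2) append: keep the old values and add one more
        return result;
    }
}
```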



I think eventually we could have this sort of feature internally copy
the terms for non-stored fields somehow, but maybe that would only
come along once Lucene supported something to facilitate this more?


Not unless you store more info (a lot more info).
We should also be able to copy unstored fields with term vectors stored.

ParallelReader might also hold some promise (putting a field to be
updated in a separate index).  The problem is that the lucene ids need
to be kept in sync... I don't know how to do that w/o reindexing.

-Yonik

Re: svn commit: r501512 - in /lucene/solr/trunk: ./ src/java/org/apache/solr/core/ src/java/org/apache/solr/handler/ src/java/org/apache/solr/request/ src/java/org/apache/solr/search/ src/java/org/apa


TODO: switch solrb to using wt=json instead of wt=ruby.

Whatcha think, Ed et al?

Erik


On Jan 30, 2007, at 1:36 PM, [EMAIL PROTECTED] wrote:


Author: yonik
Date: Tue Jan 30 10:36:32 2007
New Revision: 501512

URL: http://svn.apache.org/viewvc?view=rev&rev=501512
Log:
SimpleOrderedMap, JSON named list changes: SOLR-125

Re: svn commit: r501512 - in /lucene/solr/trunk: ./ src/java/org/apache/solr/core/ src/java/org/apache/solr/handler/ src/java/org/apache/solr/request/ src/java/org/apache/solr/search/ src/java/org/apa


On 1/31/07, Erik Hatcher [EMAIL PROTECTED] wrote:

TODO: switch solrb to using wt=json instead of wt=ruby.


Why is that?

-Yonik

charset in POST from browser


It seems that browsers do a form POST in the charset that the page was
encoded in.
Modifying form.jsp in solr/admin seems to work... the data comes
across encoded in UTF8.

The problem is that the charset isn't defined to be UTF-8 in the
headers, so the bytes are assumed to be latin-1.

Is this a problem we can fix in solr, or is it purely container config?

This will mimic what the browser sends back:
curl -i http://localhost:8983/solr/select -d 'q=%C3%AA'

-Yonik

Re: loading many documents by ID

2007-01-31 Thread Walter Underwood

On 1/31/07 3:39 PM, Chris Hostetter [EMAIL PROTECTED] wrote:
 
 : Oh, and there have been numerous people interested in updateable
 : documents, so it would be nice if that part was in the update handler.
 
 We'd have to make it very clear that this only works if all fields are
 STORED.

Isn't there some way to do this automatically instead of relying
on documentation? We might need to add something, maybe a
required attribute on fields, but a runtime error would be
much, much better than a page on the wiki.

wunder

Re: loading many documents by ID



 We'd have to make it very clear that this only works if all fields are
 STORED.

Isn't there some way to do this automatically instead of relying
on documentation? We might need to add something, maybe a
required attribute on fields, but a runtime error would be
much, much better than a page on the wiki.



what about copyField?

With copyField, it is reasonable to have fields that are not stored
and are generated from the other stored fields.  (this is what my
setup looks like)
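
A runtime check that allows for copyField could be sketched like this. Everything here is hypothetical (UpdateableCheck, check, and the way the schema is represented), and it makes the simplifying assumption that every copyField source is itself a stored field.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical schema check for an "update existing document" feature. */
final class UpdateableCheck {
    /**
     * storedByField maps each field name to whether it is stored.
     * copyFieldTargets holds fields that are regenerated from other fields.
     * Throws if some field could not be rebuilt from the stored data.
     */
    static void check(Map<String, Boolean> storedByField, Set<String> copyFieldTargets) {
        for (Map.Entry<String, Boolean> e : storedByField.entrySet()) {
            boolean stored = e.getValue().booleanValue();
            boolean regenerated = copyFieldTargets.contains(e.getKey());
            if (!stored && !regenerated) {
                throw new IllegalStateException(
                    "field '" + e.getKey() + "' is neither stored nor a copyField target");
            }
        }
    }
}
```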

[jira] Closed: (SOLR-129) Solrb - UTF 8 Support for add/delete

2007-01-31 Thread Erik Hatcher (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher closed SOLR-129.
-

Resolution: Cannot Reproduce

I added a controller and view to display the features from the 
utf8-example.xml file to flare.

1) fire up the Solr example application, and post.sh *.xml from the 
exampledocs directory.

2) fire up flare, hit /i18n (http://localhost:3000/i18n)

Showing all the accented characters worked fine for me.

I suspect we probably still have some i18n issues to iron out, so any help or 
at least test cases in that regard would be most helpful.

 Solrb - UTF 8 Support for add/delete
 

 Key: SOLR-129
 URL: https://issues.apache.org/jira/browse/SOLR-129
 Project: Solr
  Issue Type: Bug
  Components: clients - ruby - flare
 Environment: OSX
Reporter: Antonio Eggberg

 Hi:
 This could be a Ruby UTF-8 bug. Anyway, when I add a UTF-8 document 
 via post.sh and then query via Solr Admin, everything works as it should. 
 However, adding a UTF-8 doc using the solrb ruby lib or flare doesn't work as it 
 should. I am not sure what I am doing wrong, and I don't think it's Solr, because 
 it works as it should there.
 Could this be a well-known Ruby UTF-8 bug? I am using ruby 1.8.5 with rails 1.2.1.
 Cheers


Re: loading many documents by ID

2007-01-31 Thread Walter Underwood

On 1/31/07 9:05 PM, Ryan McKinley [EMAIL PROTECTED] wrote:
 
 We'd have to make it very clear that this only works if all fields are
 STORED.
 
 Isn't there some way to do this automatically instead of relying
 on documentation? We might need to add something, maybe a
 required attribute on fields, but a runtime error would be
 much, much better than a page on the wiki.
 
 what about copyField?
 
 With copyField, it is reasonable to have fields that are not stored
 and are generated from the other stored fields.  (this is what my
 setup looks like).

Mine, too. That is why I suggested explicit declarations in the
schema to say which fields are required.

wunder

Re: empty contentStream?

2007-01-31 Thread Chris Hostetter


:  1. this should be implemented with two forms - every field sent should be used
:  2. if stream.body.trim().length() == 0, don't make a stream
: 
:  I vote for #2, thoughts?
:
: Sigh... yes, it's practical.

Alternate Idea #3: make the XmlUpdateRequestHandler more robust in
receiving empty streams (treat it as a NOOP, maybe return an error if
*all* the streams are empty)

i'm okay with #2 as long as it's only in the stream.body parsing and not
something we try to do with every stream.
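
Idea #3 could be sketched as below. This is a hypothetical illustration rather than the actual handler: EmptyStreamPolicy and countParsable are made-up names, strings stand in for content streams, and "parsing" is reduced to counting.

```java
import java.util.List;

/** Hypothetical policy: empty streams are a NOOP unless all are empty. */
final class EmptyStreamPolicy {
    /**
     * Returns how many streams would actually be parsed.
     * Throws if there is nothing to parse at all.
     */
    static int countParsable(List<String> streams) {
        int parsed = 0;
        for (String s : streams) {
            if (s == null || s.trim().length() == 0) {
                continue; // empty stream: treat as a NOOP
            }
            parsed++; // a real handler would parse the XML here
        }
        if (parsed == 0) {
            throw new IllegalArgumentException("all content streams were empty");
        }
        return parsed;
    }
}
```

Note that in this sketch a request with no streams at all is reported the same way as one whose streams are all empty.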



-Hoss

Re: charset in POST from browser


On 2/1/07, Chris Hostetter [EMAIL PROTECTED] wrote:

: The problem is that the charset isn't defined to be UTF-8 in the
: headers, so the bytes are assumed to be latin-1.
:
: Is this a problem we can fix in solr, or is it purely container config?

umm... we already fixed this the best way i know how in SOLR-35 ... all of
the JSPs that have forms should have this in them...

<%@ page contentType="text/html; charset=utf-8" pageEncoding="UTF-8"%>

...is resin not respecting that?


The form that gets sent to the browser is in UTF8, and the browser
correctly sends back UTF8 in the post body.  *But* the browser doesn't
tell the container what the charset of the body is, so it's up to the
container to guess.  By default, resin seems to pick latin-1.

It seems like we should assume UTF-8 if no charset is sent for a text
content type.
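
The fallback suggested here could be sketched as a small header parser. This is a hypothetical illustration, not existing Solr or container code: CharsetDefaulting and charsetOf are made-up names, and for brevity the sketch applies the UTF-8 default to every content type rather than checking that the type is textual.

```java
import java.util.Locale;

/** Hypothetical helper: picks the charset for a request body. */
final class CharsetDefaulting {
    /** Returns the charset named in the Content-Type header, else "UTF-8". */
    static String charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String cs = p.substring("charset=".length()).trim();
                    // the value may be quoted: charset="utf-8"
                    if (cs.length() >= 2 && cs.startsWith("\"") && cs.endsWith("\"")) {
                        cs = cs.substring(1, cs.length() - 1);
                    }
                    if (cs.length() > 0) {
                        return cs;
                    }
                }
            }
        }
        return "UTF-8"; // no charset sent: assume UTF-8 instead of latin-1
    }
}
```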

-Yonik

Re: resin and UTF-8 in URLs


I just tried this on two systems... it worked on one (I got the ê) and
the other I get Ãª -- both running resin 3.0.21

The one that works has http://securityfilter.sourceforge.net/ applied.
I'll look into what securityfilter is doing... it may be setting
something explicitly
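
The two results reported above match what happens when the same percent-encoded bytes are decoded with different charsets. A minimal sketch using the standard java.net.URLDecoder (the class QueryDecodingDemo and its decode wrapper are hypothetical names):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Shows how %C3%AA decodes under UTF-8 versus ISO-8859-1 (latin-1). */
final class QueryDecodingDemo {
    /** Wraps URLDecoder.decode so callers need not handle the checked exception. */
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // UTF-8: bytes C3 AA form one character, U+00EA (e with circumflex)
        System.out.println(decode("%C3%AA", "UTF-8"));
        // ISO-8859-1: each byte is its own character, U+00C3 then U+00AA
        System.out.println(decode("%C3%AA", "ISO-8859-1"));
    }
}
```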

Re: empty contentStream?


I just posted SOLR-85 using strategy #2.

It makes sure stream.body and stream.url have content before making
streams out of them.  I think this makes sense given they are likely
to be used in forms similar to the 'update.jsp' where they may or may
not have content.



i'm okay with #2 as long as it's only in the stream.body parsing and not
something we try to do with every stream.



I totally agree it should not check 'real' streams, but these are
essentially helper streams that make it easy to post a stream from a
form.

Re: resin and UTF-8 in URLs


On 2/1/07, Ryan McKinley [EMAIL PROTECTED] wrote:

I just tried this on two systems... it worked on one (I got the ê) and
the other I get Ãª -- both running resin 3.0.21


A co-worker informed me that adding a character-encoding attribute to
the web-app tag in web.xml will force a charset if not defined.  Seems
to work for both GET and POST.

<web-app character-encoding="utf-8">

This looks resin-specific though.

-Yonik

Re: charset in POST from browser

2007-01-31 Thread Chris Hostetter


: The form that gets sent to the browser is in UTF8, and the browser
: correctly sends back UTF8 in the post body.  *But* the browser doesn't
: tell the container what the charset of the body is, so it's up to the
: container to guess.  By default, resin seems to pick latin-1.

That's really weird ... i could have sworn browsers doing POST of form
data were supposed to send a full content-type...

   Content-type: application/x-www-form-urlencoded; charset=utf-8

...picking the charset based on the charset of the page containing the
form  (i assume you tested and verified this isn't happening?)

a quick google search turned up this page, with this info...

http://www.systemvikar.biz/faq/servlet.xtp



Form character encoding doesn't work

A POST request with application/x-www-form-urlencoded doesn't contain any
information about the character request. So Resin needs to use a set of
heuristics to decode the form. Here's the order:

   1. request.getAttribute("caucho.form.character.encoding")
   2. The response.setContentType() encoding of the page.
   3. The character-encoding tag in the resin.conf.

Resin uses the default character encoding of your JVM to read form data.
To set the encoding to another charset, you'll need to change the
resin.conf as follows:

<http-server character-encoding='Shift_JIS'>
  ...
</http-server>

Re: empty contentStream?

2007-01-31 Thread Chris Hostetter


: It makes sure stream.body and stream.url have content before making
: streams out of them.  I think this makes sense given they are likely
: to be used in forms similar to the 'update.jsp' where they may or may
: not have content.

yeah ... good call.


-Hoss

Re: svn commit: r501512 - in /lucene/solr/trunk: ./ src/java/org/apache/solr/core/ src/java/org/apache/solr/handler/ src/java/org/apache/solr/request/ src/java/org/apache/solr/search/ src/java/org/apa



On Jan 31, 2007, at 11:08 PM, Yonik Seeley wrote:


On 1/31/07, Erik Hatcher [EMAIL PROTECTED] wrote:

TODO: switch solrb to using wt=json instead of wt=ruby.


Why is that?


To benefit from a richer data structure and to avoid eval. I hear eval is
likely to be slower than parsing JSON, and it is potentially more dangerous
if code somehow got slipped in, though that risk is not very high.


The downside is that we'd need to add a dependency on a JSON parsing
library. Interestingly, JSON is close enough to Ruby syntax that it can
practically be eval'd, but I don't think it's close enough to rely on that.


Erik

Re: charset in POST from browser


On 2/1/07, Chris Hostetter [EMAIL PROTECTED] wrote:

: The form that gets sent to the browser is in UTF8, and the browser
: correctly sends back UTF8 in the post body.  *But* the browser doesn't
: tell the container what the charset of the body is, so it's up to the
: container to guess.  By default, resin seems to pick latin-1.

That's really weird ... i could have sworn browsers doing POST of form
data were supposed to send a full content-type...

   Content-type: application/x-www-form-urlencoded; charset=utf-8

...picking the charset based on the charset of the page containing the
form  (i assume you tested and verified this isn't happening?)


Yep, FireFox2.
I'd serve the page, do a search, kill the solr server, run nc -l -p
8983, and run the search again.  The body was encoded correctly, but
just no charset info.

I tried setting it explicitly by appending to enctype in the form, but
it doesn't go through.

-Yonik

Re: svn commit: r501512 - in /lucene/solr/trunk: ./ src/java/org/apache/solr/core/ src/java/org/apache/solr/handler/ src/java/org/apache/solr/request/ src/java/org/apache/solr/search/ src/java/org/apa