[jira] Updated: (SOLR-255) RemoteSearchable for Solr(use RMI)

2007-07-24 Thread Toru Matsuzawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toru Matsuzawa updated SOLR-255:


Attachment: solr-multi20070724..zip

Patch updated.

The changes are as follows:
1) Added a simple unit test.
2) Shards are now searched in parallel.
3) Added a cache (like FieldCache) for getInts(field), getStringIndex(field), etc.
4) Fixed a bug in ascending sort.
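
As a rough illustration of change 2), searching shards in parallel and merging the results might look like the sketch below. This is not the code from the attached patch: the `Shard` interface, the `Hit` class, and `searchAll` are hypothetical names introduced only for this example, and the merge simply keeps the highest-scoring hits.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelShardSearch {
    /** One scored hit; shard and doc together identify the document (hypothetical type). */
    public static final class Hit {
        public final int shard;
        public final int doc;
        public final float score;

        public Hit(int shard, int doc, float score) {
            this.shard = shard;
            this.doc = doc;
            this.score = score;
        }
    }

    /** Hypothetical stand-in for one local or remote searcher. */
    public interface Shard {
        List<Hit> search(String query, int topN) throws Exception;
    }

    /** Queries every shard concurrently, then keeps the topN hits by descending score. */
    public static List<Hit> searchAll(List<Shard> shards, String query, int topN) throws Exception {
        if (shards.isEmpty()) {
            return new ArrayList<>();
        }
        ExecutorService pool = Executors.newFixedThreadPool(shards.size());
        try {
            List<Callable<List<Hit>>> tasks = new ArrayList<>();
            for (Shard shard : shards) {
                tasks.add(() -> shard.search(query, topN));
            }
            List<Hit> merged = new ArrayList<>();
            // invokeAll blocks until every shard has answered (or failed).
            for (Future<List<Hit>> result : pool.invokeAll(tasks)) {
                merged.addAll(result.get());
            }
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(topN, merged.size())));
        } finally {
            pool.shutdown();
        }
    }
}
```

With one thread per shard, the total latency is roughly that of the slowest shard rather than the sum of all shards.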

Standalone/solr-multi20070724-NoRMI.patch is a patch for local index searches
only.
To use RMI, please apply RMI/lucene4solr-multi.patch and
RMI/solr-multi20070724.patch.
(lucene4solr-multi.patch is a patch for Lucene.)


 RemoteSearchable for Solr(use RMI)
 --

 Key: SOLR-255
 URL: https://issues.apache.org/jira/browse/SOLR-255
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Toru Matsuzawa
 Attachments: solr-multi20070606.zip, solr-multi20070724..zip


 I experimentally implemented Lucene's RemoteSearchable for Solr.
 I referred to FederatedSearch and used RMI.
 Two or more Searchers can be referenced from SolrIndexSearcher.
 These query-only indexes can be specified in solrconfig.xml
 by listing them under a searchIndex tag:
   <searchIndex>
     <lst>E:\sample\data1</lst>
     <lst>E:\sample\data2</lst>
     <lst>rmi://localhost</lst>
   </searchIndex>
 The index in the dataDir is also used, as Solr's default index,
 for both updates and queries.
 When a document in an index listed under searchIndex is updated,
 that document is deleted from that index and the updated document
 is stored in the index in the dataDir.
 SolrRemoteSearchable (the searcher for remote access) is started from
 SolrCore by specifying <remoteSearcher>true</remoteSearcher> in
 solrconfig.xml. (It is registered in RMI.)
 (-Djava.security.policy should be set when you start the VM.)
 Not all operational cases have been tested, because Solr has so many
 features.
 Moreover, no unit tests have been written, because I built this
 through trial and error.
 Some changes to Lucene are required to run this.
 I need your comments on this, although reviewing it may be hard
 without unit tests.
 I am especially concerned about the following:
 - Am I on the right track with this issue?
 - Is the extent of the Lucene modifications tolerable?
 - Are there any ideas for implementing this feature without modifying
   Lucene?
 - Does this idea contribute to improving Solr?
 - This implementation may partially overlap with Multiple Solr Cores.
   What should be done?
 - Are there any other considerations about this issue that I have
   overlooked?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-286) Japanese document is garbled by using solr java client.

2007-07-03 Thread Toru Matsuzawa (JIRA)
Japanese document is garbled by using solr java client. 


 Key: SOLR-286
 URL: https://issues.apache.org/jira/browse/SOLR-286
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 1.3
Reporter: Toru Matsuzawa


Japanese documents are garbled when they are sent through the Solr Java client.
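
The report does not say what causes the garbling. One common cause of this kind of symptom, assumed here purely for illustration, is that text is encoded with one charset and decoded with another. The sketch below shows that effect with plain JDK calls; `CharsetMismatch`, `send`, and `receive` are hypothetical names and are not part of the Solr client.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatch {
    /** Encodes text the way a client that sends UTF-8 would. */
    public static byte[] send(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes the request body with whatever charset the receiver assumes. */
    public static String receive(byte[] body, Charset assumed) {
        return new String(body, assumed);
    }
}
```

For the three-character string `"\u65E5\u672C\u8A9E"` (the word for "Japanese language"), `receive(send(s), StandardCharsets.UTF_8)` returns the original text, while `receive(send(s), StandardCharsets.ISO_8859_1)` returns nine unrelated Latin-1 characters.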





[jira] Updated: (SOLR-286) Japanese document is garbled by using solr java client.

2007-07-03 Thread Toru Matsuzawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toru Matsuzawa updated SOLR-286:


Attachment: ContentStreamBase.patch

Patch attached.




[jira] Commented: (SOLR-255) RemoteSearchable for Solr(use RMI)

2007-06-26 Thread Toru Matsuzawa (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508380
 ] 

Toru Matsuzawa commented on SOLR-255:
-

Hi Otis & Henri,

Otis Gospodnetic wrote:
> So with your patch one can search any *one* of those indices,
> or any *group* of those indices,
> correct?  In the case where a *group* of indices is searched,
> do you search them in parallel and merge the results?

With my patch, one can search a group of these indices.
The indexes in the group are searched in sequence,
and then the results are merged.
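
The merge step described above could be sketched as follows. This is an assumption-laden illustration rather than the patch's code: `SequentialMerge`, `ScoredDoc`, `starts`, and `merge` are hypothetical names, and mapping each sub-index's local document ids into one global id space by adding an offset is a design choice made for this example (Lucene's MultiSearcher takes a similar approach).

```java
import java.util.ArrayList;
import java.util.List;

public class SequentialMerge {
    /** A document id paired with its score (hypothetical type). */
    public static final class ScoredDoc {
        public final int doc;
        public final float score;

        public ScoredDoc(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /** starts[i] is the global id of the first document of sub-index i. */
    public static int[] starts(int[] maxDocs) {
        int[] starts = new int[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
        return starts;
    }

    /**
     * Walks the per-index hit lists one after another, rewrites local ids
     * to global ids, and keeps the topN hits by descending score.
     */
    public static List<ScoredDoc> merge(List<List<ScoredDoc>> perIndexHits, int[] maxDocs, int topN) {
        int[] starts = starts(maxDocs);
        List<ScoredDoc> all = new ArrayList<>();
        for (int i = 0; i < perIndexHits.size(); i++) {
            for (ScoredDoc hit : perIndexHits.get(i)) {
                all.add(new ScoredDoc(starts[i] + hit.doc, hit.score));
            }
        }
        all.sort((a, b) -> Float.compare(b.score, a.score));
        return new ArrayList<>(all.subList(0, Math.min(topN, all.size())));
    }
}
```

Scores from different indexes are compared directly here, which is only meaningful if every index computes them on the same scale.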

Henri Biestro wrote:
> I've been looking quickly at your patch and
> kinda understand why Otis is pushing for a merge. :-)
> I don't know how this is usually done;
> should we merge the 2 issues and merge our patches?
> I can try & see how this goes if you want.

I inspected the SOLR-215 patch.
SOLR-215 and SOLR-255 overlap in the constructors of
SolrIndexSearcher and SolrCore.
The two sets of modifications should be committed one after the other.
After that, few additional modifications are needed.

The commits should be made in stages.
(Step 1 and Step 2 could be done in reverse order, or perhaps simultaneously.)
Step1) MultiCore (SOLR-215)
Step2) The MultiSearcher functionality, excluding the RMI and Lucene
   modifications (SolrMultiSearcher and SolrIndexSearchable)
Step3) The Lucene modifications
Step4) The RMI functionality (SolrRemoteSearcher)
   (Once MultiCore is in place, an additional modification will be needed
   so that the RMI remote objects are created dynamically.)

> One thing that worries me though is the Lucene patch dependency;
> any way to only have a Solr patch?
> I would suspect that Lucene committers are as busy as Solr's
> so the review process might take some time.
> Although from afar, it does look like pretty harmless changes so there is
> hope...

The RMI part (SolrRemoteSearcher) is what causes the Lucene patch dependency.
With the procedure above, there will be no impact on SOLR-215.

> As a side note, I was wondering if we could extend
> your patch's functionality and get read/write capability per index
> (as in http://hellonline.com/blog/?p=55 ,
> document indexing load balancing could be performed
> on hashing unique key % number of indexes for instance
> or by some configurable class).
> The current functionality would be retained
> by specifying 'read-only' versus 'read-write' for each index.

I also have ideas about this, but they are not concrete enough yet.
In any case, that would be done in Step 5 and later.
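
The "hashing unique key % number of indexes" idea quoted above could be sketched like this. Nothing of the kind exists in the patch; `ShardRouter` and `shardFor` are hypothetical names, and using `String.hashCode()` as the hash function is an assumption made for brevity.

```java
public class ShardRouter {
    /**
     * Picks the index that should hold a document by hashing its unique key.
     * Math.floorMod keeps the result in [0, numIndexes) even when hashCode()
     * is negative, which the plain % operator would not.
     */
    public static int shardFor(String uniqueKey, int numIndexes) {
        if (numIndexes <= 0) {
            throw new IllegalArgumentException("numIndexes must be positive");
        }
        return Math.floorMod(uniqueKey.hashCode(), numIndexes);
    }
}
```

The same key always maps to the same index, so an update can find the index that holds the previous version of a document. Changing the number of indexes changes the mapping for most keys.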

Thanks.





[jira] Created: (SOLR-277) Character Entity of XHTML is not supported with XmlUpdateRequestHandler .

2007-06-26 Thread Toru Matsuzawa (JIRA)
Character Entity of XHTML is not supported with XmlUpdateRequestHandler .
-

 Key: SOLR-277
 URL: https://issues.apache.org/jira/browse/SOLR-277
 Project: Solr
  Issue Type: Improvement
  Components: update
Affects Versions: 1.3
Reporter: Toru Matsuzawa


The character entities of XHTML are not supported by XmlUpdateRequestHandler.

http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent
http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent

XmlUpdateRequestHandler itself needs to handle them, because xpp3 cannot
process <!DOCTYPE>.
I think this is needed until StaxUpdateRequestHandler becomes the handler for /update.
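
One way to picture what "supporting" these entities means, without relying on a DTD, is to rewrite named entities into numeric character references before the XML reaches the parser. The sketch below is not what the attached patch does; `XhtmlEntities` and `toNumericReferences` are hypothetical names, and the table holds only five sample entries from the XHTML entity sets instead of the full lists.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class XhtmlEntities {
    // A few sample entries only; the three .ent files define many more.
    private static final Map<String, Integer> CODE_POINTS = new HashMap<>();
    static {
        CODE_POINTS.put("nbsp", 0x00A0);
        CODE_POINTS.put("copy", 0x00A9);
        CODE_POINTS.put("eacute", 0x00E9);
        CODE_POINTS.put("hellip", 0x2026);
        CODE_POINTS.put("euro", 0x20AC);
    }

    private static final Pattern ENTITY = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    /**
     * Replaces known named entities with decimal character references.
     * Unknown names, including the five predefined XML entities, are
     * left untouched for the parser to handle.
     */
    public static String toNumericReferences(String xml) {
        Matcher m = ENTITY.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Integer codePoint = CODE_POINTS.get(m.group(1));
            String replacement = (codePoint == null) ? m.group() : "&#" + codePoint + ";";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Because this is a plain text substitution, it would also rewrite entity-like text inside CDATA sections, so it is only a starting point.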





[jira] Updated: (SOLR-277) Character Entity of XHTML is not supported with XmlUpdateRequestHandler .

2007-06-26 Thread Toru Matsuzawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toru Matsuzawa updated SOLR-277:


Attachment: XmlUpdateRequestHandler.patch

Patch attached.




[jira] Commented: (SOLR-277) Character Entity of XHTML is not supported with XmlUpdateRequestHandler .

2007-06-26 Thread Toru Matsuzawa (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508421
 ] 

Toru Matsuzawa commented on SOLR-277:
-

Hi Walter,
I understand that this is not a bug.
I also understand that this patch will be short-lived.

I thought you might want to support general entities, so I provided this patch,
because I thought it would make things easier for users.

The current behavior seems to follow the specification of xpp3.
(Currently xpp3 supports only the basic Latin entities:
&quot; &amp; &lt; &gt; &apos;.)

This issue can be closed if, by specification, the Solr XML format does not
support the character entities of XHTML.

Thanks,





[jira] Commented: (SOLR-255) RemoteSearchable for Solr(use RMI)

2007-06-15 Thread Toru Matsuzawa (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505078
 ] 

Toru Matsuzawa commented on SOLR-255:
-

My implementation is as follows.

client --- Solr instance +-- /data/index
                         +-- /local/other/solr1/data/index
                         +-- /local/other/solr2/data/index
                         +-- /local/other/solr3/data/index
                         |
                         +--- RMI --- Remote Solr index (/data/index)
                         |
                         +--- RMI --- Remote Solr index (/data/index)

First of all, I'd like your comments on whether my modification is
beneficial for Solr.
If it is, is it possible to implement these changes without modifying
Lucene?
I need your comments on this point. (If you have any other ideas for the
implementation, please let me know.)

Once it becomes clear that modifying Lucene is necessary, I'd like to post
the patch to Lucene's JIRA.

I understand there is some overlap with SOLR-215.
However, I think this modification can coexist with SOLR-215.

Thank you.






[jira] Created: (SOLR-255) RemoteSearchable for Solr(use RMI)

2007-06-06 Thread Toru Matsuzawa (JIRA)
RemoteSearchable for Solr(use RMI)
--

 Key: SOLR-255
 URL: https://issues.apache.org/jira/browse/SOLR-255
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Toru Matsuzawa


I experimentally implemented Lucene's RemoteSearchable for Solr.
I referred to FederatedSearch and used RMI.

Two or more Searchers can be referenced from SolrIndexSearcher.
These query-only indexes can be specified in solrconfig.xml
by listing them under a searchIndex tag:

  <searchIndex>
    <lst>E:\sample\data1</lst>
    <lst>E:\sample\data2</lst>
    <lst>rmi://localhost</lst>
  </searchIndex>

The index in the dataDir is also used, as Solr's default index,
for both updates and queries.

When a document in an index listed under searchIndex is updated,
that document is deleted from that index and the updated document
is stored in the index in the dataDir.
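
The update rule in the previous paragraph can be modeled with a small in-memory sketch. It only illustrates the described behavior and is not taken from the patch: `UpdateRouting` and its methods are hypothetical names, and each index is reduced to a map from document id to content.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UpdateRouting {
    // Stands in for the writable index in dataDir.
    private final Map<String, String> dataDirIndex = new HashMap<>();
    // Stand in for the query-only indexes listed under searchIndex.
    private final List<Map<String, String>> searchIndexes = new ArrayList<>();

    public UpdateRouting(List<Map<String, String>> searchIndexes) {
        this.searchIndexes.addAll(searchIndexes);
    }

    /** Deletes any old copy from the query-only indexes, then writes to dataDir. */
    public void update(String id, String content) {
        for (Map<String, String> index : searchIndexes) {
            index.remove(id);
        }
        dataDirIndex.put(id, content);
    }

    /** Looks in dataDir first, then in the query-only indexes in order. */
    public String get(String id) {
        String content = dataDirIndex.get(id);
        if (content != null) {
            return content;
        }
        for (Map<String, String> index : searchIndexes) {
            content = index.get(id);
            if (content != null) {
                return content;
            }
        }
        return null;
    }

    /** True if the document currently lives in the dataDir index. */
    public boolean inDataDir(String id) {
        return dataDirIndex.containsKey(id);
    }
}
```

After an update, a document exists in exactly one place, so a search across all indexes never returns both the old and the new version.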

SolrRemoteSearchable (the searcher for remote access) is started from SolrCore
by specifying <remoteSearcher>true</remoteSearcher> in solrconfig.xml.
(It is registered in RMI.)
(-Djava.security.policy should be set when you start the VM.)

Not all operational cases have been tested,
because Solr has so many features.

Moreover, no unit tests have been written,
because I built this through trial and error.
Some changes to Lucene are required to run this.

I need your comments on this, although reviewing it may be hard without unit tests.
I am especially concerned about the following:
- Am I on the right track with this issue?
- Is the extent of the Lucene modifications tolerable?
- Are there any ideas for implementing this feature without modifying Lucene?
- Does this idea contribute to improving Solr?
- This implementation may partially overlap with Multiple Solr Cores.
  What should be done?
- Are there any other considerations about this issue that I have overlooked?





[jira] Updated: (SOLR-255) RemoteSearchable for Solr(use RMI)

2007-06-06 Thread Toru Matsuzawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toru Matsuzawa updated SOLR-255:


Attachment: solr-multi20070606.zip

I attached the patch for Lucene, the patch for Solr,
and the solrconfig.xml needed to run this.

My responses may be very slow because my English is limited.
