date:20090916

[jira] Updated: (SOLR-1441) Make it possible to run all tests in a package

2009-09-16 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1441:


Attachment: SOLR-1441.patch

Copied from Shai's patches in LUCENE-1617

We should extract out the junit task into a macro and share it across all 
contrib builds. Right now I'm adding this only for core because that has the 
bulk of unit tests.

> Make it possible to run all tests in a package
> --
>
> Key: SOLR-1441
> URL: https://issues.apache.org/jira/browse/SOLR-1441
> Project: Solr
>  Issue Type: Improvement
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>Priority: Trivial
> Attachments: SOLR-1441.patch
>
>
> Adding the following properties to junit target in build.xml
> # ant -Dtestcase - for a single test class
> # ant -Dtestpackage - for all classes in a package, including sub-packages
> # ant -Dtestpackageroot - for all classes in a package, without sub-packages

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Created: (SOLR-1441) Make it possible to run all tests in a package

2009-09-16 Thread Shalin Shekhar Mangar (JIRA)

Make it possible to run all tests in a package
--

 Key: SOLR-1441
 URL: https://issues.apache.org/jira/browse/SOLR-1441
 Project: Solr
  Issue Type: Improvement
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Trivial


Adding the following properties to junit target in build.xml

# ant -Dtestcase - for a single test class
# ant -Dtestpackage - for all classes in a package, including sub-packages
# ant -Dtestpackageroot - for all classes in a package, without sub-packages
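
The last two properties differ only in whether sub-packages are included. Below is a minimal Java sketch of that selection rule, assuming tests are identified by fully qualified class name. The class name, method names and sample test names are hypothetical, and this is not how the Ant fileset in the patch is implemented.

```java
import java.util.ArrayList;
import java.util.List;

class TestSelectionSketch {
    // Package part of a fully qualified class name ("" for the default package).
    static String packageOf(String className) {
        int dot = className.lastIndexOf('.');
        return dot < 0 ? "" : className.substring(0, dot);
    }

    // includeSubPackages = true mirrors the intent of -Dtestpackage,
    // includeSubPackages = false mirrors the intent of -Dtestpackageroot.
    static List<String> select(List<String> classNames, String pkg, boolean includeSubPackages) {
        List<String> selected = new ArrayList<>();
        for (String name : classNames) {
            String p = packageOf(name);
            boolean inRoot = p.equals(pkg);
            boolean inSub = includeSubPackages && p.startsWith(pkg + ".");
            if (inRoot || inSub) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Illustrative class names only.
        List<String> all = List.of(
            "org.example.search.TestSort",
            "org.example.search.function.TestFunction",
            "org.example.update.TestUpdate");
        System.out.println(select(all, "org.example.search", true));
        System.out.println(select(all, "org.example.search", false));
    }
}
```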



Re: Solr API

2009-09-16 Thread Asish Kumar Mohanty

Hi,

I am writing this code but still not getting the command properly.

SolrServer solr = new CommonsHttpSolrServer("http://localhost:8080/solr/db");

SolrQuery q = new SolrQuery().setParam("qt","/dataimport").setParam("command","full-import");

QueryResponse response = solr.query(q);

System.out.println("*response is** " + response);

}
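
For comparison, the request this thread is trying to issue is the plain HTTP call quoted further down (.../solr/db/dataimport?command=full-import). The sketch below only builds that URL from name/value pairs with the JDK, so it can be checked without a running Solr or a SolrJ dependency. The class and method names are hypothetical; the host, port and core name are copied from the messages in this thread.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class DataImportUrlSketch {
    // Appends the handler path and URL-encoded name=value pairs to the core URL.
    static String buildUrl(String coreUrl, String handlerPath, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(coreUrl).append(handlerPath);
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(separator)
              .append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("command", "full-import");
        System.out.println(buildUrl("http://localhost:8983/solr/db", "/dataimport", params));
    }
}
```

Comparing the printed URL with the one that works in a browser is a quick way to spot a mismatch in port or core name before debugging the SolrJ side.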

- Original Message - 
From: "Noble Paul നോബിള്‍ नोब्ळ्" 
To: 
Sent: Tuesday, September 15, 2009 12:01 PM
Subject: Re: Solr API


SolrQuery q = new
SolrQuery().setParam("qt","/dataimport").setParam("command",
"full-import");
solrServer.query(q);

On Tue, Sep 15, 2009 at 11:32 AM, Asish Kumar Mohanty
 wrote:
> Hi Sir,
> still facing problem..
>
> i cannot understand how to provide the command
> http://localhost:8983/solr/db/dataimport?command=full-import..
>
> can anybody plz help me out???
> - Original Message -
> From: "Noble Paul നോബിള്‍ नोब्ळ्" 
> To: 
> Sent: Monday, September 14, 2009 5:26 PM
> Subject: Re: Solr API
>
>
> SolrJ can be used to make any name value request to Solr.
> use the SolrQuery#set(name,val)
>
>
>
> On Mon, Sep 14, 2009 at 4:47 PM, Asish Kumar Mohanty
>  wrote:
>> Yes Sir..
>>
>> SolrJ API...
>>
>>
>>
>> Regards
>> Asish
>>
>> - Original Message -
>> From: "Noble Paul നോബിള്‍ नोब्ळ्" 
>> To: 
>> Sent: Monday, September 14, 2009 4:40 PM
>> Subject: Re: Solr API
>>
>>
>>> did you mean SolrJ API?
>>>
>>> On Mon, Sep 14, 2009 at 4:15 PM, Asish Kumar Mohanty
>>>  wrote:
>>> > Hi,
>>> >
>>> > I just want to write a Solr API for full-import. Can anybody please
> help
>> me
>>> > out???
>>> >
>>> > It's very urgent.
>>> >
>>> > Regards
>>> > Asish
>>> >
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> -
>>> Noble Paul | Principal Engineer| AOL | http://aol.com
>>>
>>
>>
>>
>
>
>
> --
> -
> Noble Paul | Principal Engineer| AOL | http://aol.com
>
>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com

[jira] Commented: (SOLR-1437) DIH: Enhance XPathRecordReader to deal with //tagname and other improvments.

2009-09-16 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756358#action_12756358
 ] 

Noble Paul commented on SOLR-1437:
--

For any normal event, parser.next() should be called in each iteration. But 
for CDATA it should not be, because handling the CDATA will already have 
consumed the next event.
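
The loop shape described in this comment can be sketched with the JDK's StAX reader. This is an illustration only, not the XPathRecordReader code: the class and method names are hypothetical, and the sketch treats CHARACTERS, CDATA and SPACE events alike as text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class CdataLoopSketch {
    static boolean isText(int event) {
        return event == XMLStreamConstants.CHARACTERS
            || event == XMLStreamConstants.CDATA
            || event == XMLStreamConstants.SPACE;
    }

    // Records element starts and merged text runs in document order.
    static List<String> parse(String xml) {
        List<String> out = new ArrayList<>();
        try {
            XMLStreamReader parser =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            int event = parser.next();
            while (event != XMLStreamConstants.END_DOCUMENT) {
                if (isText(event)) {
                    // Collecting a text run reads ahead until a non-text event is seen.
                    StringBuilder text = new StringBuilder();
                    while (isText(event)) {
                        text.append(parser.getText());
                        event = parser.next();
                    }
                    out.add("text:" + text);
                    // The next event is already consumed, so do NOT call next() again.
                    continue;
                }
                if (event == XMLStreamConstants.START_ELEMENT) {
                    out.add("start:" + parser.getLocalName());
                }
                event = parser.next(); // normal events advance once per iteration
            }
            parser.close();
        } catch (XMLStreamException e) {
            throw new RuntimeException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse("<a>x<![CDATA[y]]>z<b/>w</a>"));
    }
}
```

Calling next() unconditionally at the bottom of the loop would skip whichever event followed the text run, which is the bug the comment warns about.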

> DIH: Enhance XPathRecordReader to deal with //tagname and other improvments.
> 
>
> Key: SOLR-1437
> URL: https://issues.apache.org/jira/browse/SOLR-1437
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
>Reporter: Fergus McMenemie
>Priority: Minor
> Fix For: 1.5
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> As per 
> http://www.nabble.com/Re%3A-Extract-info-from-parent-node-during-data-import-%28redirect%3A%29-td25471162.html
>  it would be nice to be able to use expressions such as //tagname when 
> parsing XML documents.


[jira] Resolved: (SOLR-1440) DIH:LineEntityprocessor does not reinitialize the reader after init

2009-09-16 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul resolved SOLR-1440.
--

Resolution: Fixed

committed r816042

> DIH:LineEntityprocessor does not reinitialize the reader after init
> ---
>
> Key: SOLR-1440
> URL: https://issues.apache.org/jira/browse/SOLR-1440
> Project: Solr
>  Issue Type: Bug
>  Components: contrib - DataImportHandler
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1440.patch
>
>
> instead of just closing the reader it should also be set to null;
> see the mail thread 
> http://www.nabble.com/FileListEntityProcessor-and-LineEntityProcessor-to25476443.html#a25476443


[jira] Updated: (SOLR-1440) DIH:LineEntityprocessor does not reinitialize the reader after init

2009-09-16 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1440:
-

Attachment: SOLR-1440.patch

> DIH:LineEntityprocessor does not reinitialize the reader after init
> ---
>
> Key: SOLR-1440
> URL: https://issues.apache.org/jira/browse/SOLR-1440
> Project: Solr
>  Issue Type: Bug
>  Components: contrib - DataImportHandler
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1440.patch
>
>
> instead of just closing the reader it should also be set to null;
> see the mail thread 
> http://www.nabble.com/FileListEntityProcessor-and-LineEntityProcessor-to25476443.html#a25476443


[jira] Created: (SOLR-1440) DIH:LineEntityprocessor does not reinitialize the reader after init

2009-09-16 Thread Noble Paul (JIRA)

DIH:LineEntityprocessor does not reinitialize the reader after init
---

 Key: SOLR-1440
 URL: https://issues.apache.org/jira/browse/SOLR-1440
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 1.3
Reporter: Noble Paul
Assignee: Noble Paul
Priority: Minor
 Fix For: 1.4


instead of just closing the reader it should also be set to null;

see the mail thread 
http://www.nabble.com/FileListEntityProcessor-and-LineEntityProcessor-to25476443.html#a25476443
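
A minimal sketch of why nulling the reader matters, assuming the processor opens its reader lazily whenever the field is null. This is not the LineEntityProcessor source; the class name and the in-memory input are hypothetical stand-ins.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReinitializingLineSourceSketch {
    private final String content;
    private BufferedReader reader; // null means "not open yet" or "finished"

    ReinitializingLineSourceSketch(String content) {
        this.content = content;
    }

    // Returns the next line, or null once at the end of the input.
    String nextLine() {
        try {
            if (reader == null) {
                reader = new BufferedReader(new StringReader(content));
            }
            String line = reader.readLine();
            if (line == null) {
                reader.close();
                // Without this assignment the next call would reuse the closed
                // reader and fail instead of starting over.
                reader = null;
            }
            return line;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ReinitializingLineSourceSketch src = new ReinitializingLineSourceSketch("a\nb");
        for (int i = 0; i < 4; i++) {
            System.out.println(src.nextLine());
        }
    }
}
```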


[jira] Commented: (SOLR-1439) Enhance PollInterval for Java Replication

2009-09-16 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756352#action_12756352
 ] 

Noble Paul commented on SOLR-1439:
--

isn't it same as SOLR-1431?

> Enhance PollInterval for Java Replication
> -
>
> Key: SOLR-1439
> URL: https://issues.apache.org/jira/browse/SOLR-1439
> Project: Solr
>  Issue Type: New Feature
>  Components: replication (java)
> Environment: ALL
>Reporter: Bill Bell
> Fix For: 1.4
>
>
> I am not a huge fan of PollInterval. It would be great to add an option to 
> get the Index based on exact time: PollTime="*/15 * * * *" That would run at 
> every 15 minutes based on the clock. i.e. 1:00pm, 1:15pm, 1:30pm, 1:45pm, 
> etc. All my slaves are sync'd using NTP, so this would work better. Since 
> each slave starts differently, we cannot set the PollInterval="00:15:00" 
> since they would get different indexes based on when they start. The other 
> option would be to suspend polling - and start - which would be very manual I 
> guess. Setting the PollInterval to 10 seconds would be getting a new index 
> when the old one is still warming up. Even 10 seconds interval would not be 
> good, since we get so many updates, each server would have different indexes. 
> With Snap we don't have this issue.
> We get SOLR updates frequently and since they are large we cannot wait to do 
> a commit at the 15 minute mark using cron. Optimize just takes too long.
> On our system we need to limit how often the slaves get the new index. We 
> would like all slaves to get the index at the same time.
> From Noble Paul:
> The default pollInterval can behave the way you want (so that the fetches are 
> synchronized in time by the clock). Raise a separate issue and we can fix it


[jira] Issue Comment Edited: (SOLR-1439) Enhance PollInterval for Java Replication

2009-09-16 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756352#action_12756352
 ] 

Noble Paul edited comment on SOLR-1439 at 9/16/09 8:41 PM:
---

isn't it same as SOLR-1435?

  was (Author: noble.paul):
isn't it same as SOLR-1431?
  
> Enhance PollInterval for Java Replication
> -
>
> Key: SOLR-1439
> URL: https://issues.apache.org/jira/browse/SOLR-1439
> Project: Solr
>  Issue Type: New Feature
>  Components: replication (java)
> Environment: ALL
>Reporter: Bill Bell
> Fix For: 1.4
>
>
> I am not a huge fan of PollInterval. It would be great to add an option to 
> get the Index based on exact time: PollTime="*/15 * * * *" That would run at 
> every 15 minutes based on the clock. i.e. 1:00pm, 1:15pm, 1:30pm, 1:45pm, 
> etc. All my slaves are sync'd using NTP, so this would work better. Since 
> each slave starts differently, we cannot set the PollInterval="00:15:00" 
> since they would get different indexes based on when they start. The other 
> option would be to suspend polling - and start - which would be very manual I 
> guess. Setting the PollInterval to 10 seconds would be getting a new index 
> when the old one is still warming up. Even 10 seconds interval would not be 
> good, since we get so many updates, each server would have different indexes. 
> With Snap we don't have this issue.
> We get SOLR updates frequently and since they are large we cannot wait to do 
> a commit at the 15 minute mark using cron. Optimize just takes too long.
> On our system we need to limit how often the slaves get the new index. We 
> would like all slaves to get the index at the same time.
> From Noble Paul:
> The default pollInterval can behave the way you want (so that the fetches are 
> synchronized in time by the clock). Raise a separate issue and we can fix it


[jira] Created: (SOLR-1439) Enhance PollInterval for Java Replication

2009-09-16 Thread Bill Bell (JIRA)

Enhance PollInterval for Java Replication
-

 Key: SOLR-1439
 URL: https://issues.apache.org/jira/browse/SOLR-1439
 Project: Solr
  Issue Type: New Feature
  Components: replication (java)
 Environment: ALL
Reporter: Bill Bell
 Fix For: 1.4


I am not a huge fan of PollInterval. It would be great to add an option to get 
the Index based on exact time: PollTime="*/15 * * * *" That would run at every 
15 minutes based on the clock. i.e. 1:00pm, 1:15pm, 1:30pm, 1:45pm, etc. All my 
slaves are sync'd using NTP, so this would work better. Since each slave starts 
differently, we cannot set the PollInterval="00:15:00" since they would get 
different indexes based on when they start. The other option would be to 
suspend polling - and start - which would be very manual I guess. Setting the 
PollInterval to 10 seconds would be getting a new index when the old one is 
still warming up. Even 10 seconds interval would not be good, since we get so 
many updates, each server would have different indexes. With Snap we don't have 
this issue.

We get SOLR updates frequently and since they are large we cannot wait to do a 
commit at the 15 minute mark using cron. Optimize just takes too long.

On our system we need to limit how often the slaves get the new index. We would 
like all slaves to get the index at the same time.

>From Noble Paul:
The default pollInterval can behave the way you want (so that the fetches are 
synchronized in time by the clock). Raise a separate issue and we can fix it
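
One way to read "synchronized in time by the clock" is that each slave waits until the next wall-clock boundary instead of counting from its own start time. The sketch below shows that calculation under the assumption that the interval divides an hour evenly; the class and method names are hypothetical and this is not the replication handler's code.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

class ClockAlignedPollSketch {
    // Time to wait from 'now' until the next multiple of intervalMinutes past the hour.
    // Assumes intervalMinutes divides 60 (for example 5, 10, 15, 30).
    static Duration delayUntilNextBoundary(LocalDateTime now, int intervalMinutes) {
        LocalDateTime hourStart = now.truncatedTo(ChronoUnit.HOURS);
        long minutesIntoHour = ChronoUnit.MINUTES.between(hourStart, now);
        long nextSlot = (minutesIntoHour / intervalMinutes + 1) * intervalMinutes;
        return Duration.between(now, hourStart.plusMinutes(nextSlot));
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.of(2009, 9, 16, 13, 7, 30);
        // A slave started at 13:07:30 would first poll at 13:15:00.
        System.out.println(delayUntilNextBoundary(now, 15));
    }
}
```

Slaves whose clocks are kept in step by NTP would then all poll at the same boundaries regardless of when each one started.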



[jira] Updated: (SOLR-1432) FunctionQueries aren't correctly weighted

2009-09-16 Thread Yonik Seeley (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-1432:
---

Attachment: SOLR-1432.patch

Updated patch with tests that fail w/o correct weighting behavior.

> FunctionQueries aren't correctly weighted
> -
>
> Key: SOLR-1432
> URL: https://issues.apache.org/jira/browse/SOLR-1432
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Fix For: 1.4
>
> Attachments: SOLR-1432.patch, SOLR-1432.patch
>
>
> Nested queries in function queries aren't weighted correctly with the proper 
> Searcher, and this is now even more serious with per-segment searching in 
> Lucene/Solr.


[jira] Commented: (SOLR-284) Parsing Rich Document Types

2009-09-16 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756266#action_12756266
 ] 

Yonik Seeley commented on SOLR-284:
---

bq. example solrconfig.xml at the head of SVN trunk still uses map, not fmap.

Thanks, I just fixed this.

> Parsing Rich Document Types
> ---
>
> Key: SOLR-284
> URL: https://issues.apache.org/jira/browse/SOLR-284
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Reporter: Eric Pugh
>Assignee: Grant Ingersoll
> Fix For: 1.4
>
> Attachments: libs.zip, rich.patch, rich.patch, rich.patch, 
> rich.patch, rich.patch, rich.patch, rich.patch, schema_update.patch, 
> SOLR-284-no-key-gen.patch, SOLR-284.patch, SOLR-284.patch, SOLR-284.patch, 
> SOLR-284.patch, SOLR-284.patch, SOLR-284.patch, SOLR-284.patch, 
> SOLR-284.patch, SOLR-284.patch, solr-word.pdf, source.zip, test-files.zip, 
> test-files.zip, test.zip, un-hardcode-id.diff
>
>
> I have developed a RichDocumentRequestHandler based on the CSVRequestHandler 
> that supports streaming a PDF, Word, Powerpoint, Excel, or PDF document into 
> Solr.
> There is a wiki page with information here: 
> http://wiki.apache.org/solr/UpdateRichDocuments
>  


[jira] Commented: (SOLR-284) Parsing Rich Document Types

2009-09-16 Thread Chris Harris (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756259#action_12756259
 ] 

Chris Harris commented on SOLR-284:
---

Grant and company: I just noticed that the example solrconfig.xml at the head 
of SVN trunk still uses map, not fmap. (In particular, there's "map.content", 
"map.a", and "map.div".) I assume this should be fixed for the 1.4 release. 
Interestingly, this doesn't seem to make any unit tests fail.

> Parsing Rich Document Types
> ---
>
> Key: SOLR-284
> URL: https://issues.apache.org/jira/browse/SOLR-284
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Reporter: Eric Pugh
>Assignee: Grant Ingersoll
> Fix For: 1.4
>
> Attachments: libs.zip, rich.patch, rich.patch, rich.patch, 
> rich.patch, rich.patch, rich.patch, rich.patch, schema_update.patch, 
> SOLR-284-no-key-gen.patch, SOLR-284.patch, SOLR-284.patch, SOLR-284.patch, 
> SOLR-284.patch, SOLR-284.patch, SOLR-284.patch, SOLR-284.patch, 
> SOLR-284.patch, SOLR-284.patch, solr-word.pdf, source.zip, test-files.zip, 
> test-files.zip, test.zip, un-hardcode-id.diff
>
>
> I have developed a RichDocumentRequestHandler based on the CSVRequestHandler 
> that supports streaming a PDF, Word, Powerpoint, Excel, or PDF document into 
> Solr.
> There is a wiki page with information here: 
> http://wiki.apache.org/solr/UpdateRichDocuments
>  


[jira] Created: (SOLR-1438) Timeout distributed query stage get fields

2009-09-16 Thread Jason Rutherglen (JIRA)

Timeout distributed query stage get fields
--

 Key: SOLR-1438
 URL: https://issues.apache.org/jira/browse/SOLR-1438
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Jason Rutherglen
Priority: Minor
 Fix For: 1.5


In a distributed query, timeouts work for PURPOSE_GET_TOP_IDS
but we need them for PURPOSE_GET_FIELDS (obtaining the document
data). We'll reuse the timeAllowed parameter and pass it to the
shards during the get fields distributed request.


[jira] Commented: (SOLR-1336) Add support for lucene's SmartChineseAnalyzer

2009-09-16 Thread Stanislaw Osinski (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756177#action_12756177
 ] 

Stanislaw Osinski commented on SOLR-1336:
-

Keeping the Chinese analyzer JAR optional sounds good. As Carrot2 also uses it, 
I'd need to make sure the clustering contrib doesn't fail when the JAR is not 
there and clustering in Chinese is requested (I think I'd simply log a WARN 
saying that the Chinese analyzer JAR is required for best clustering results).

> Add support for lucene's SmartChineseAnalyzer
> -
>
> Key: SOLR-1336
> URL: https://issues.apache.org/jira/browse/SOLR-1336
> Project: Solr
>  Issue Type: New Feature
>  Components: Analysis
>Reporter: Robert Muir
> Attachments: SOLR-1336.patch, SOLR-1336.patch, SOLR-1336.patch
>
>
> SmartChineseAnalyzer was contributed to lucene, it indexes simplified chinese 
> text as words.
> if the factories for the tokenizer and word token filter are added to solr it 
> can be used, although there should be a sample config or wiki entry showing 
> how to apply the built-in stopwords list.
> this is because it doesn't contain actual stopwords, but must be used to 
> prevent indexing punctuation... 
> note: we did some refactoring/cleanup on this analyzer recently, so it would 
> be much easier to do this after the next lucene update.
> it has also been moved out of -analyzers.jar due to size, and now builds in 
> its own smartcn jar file, so that would need to be added if this feature is 
> desired.


[jira] Commented: (SOLR-284) Parsing Rich Document Types

2009-09-16 Thread Chris Harris (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756154#action_12756154
 ] 

Chris Harris commented on SOLR-284:
---

This caught me by surprise, so I'm noting it here in case it helps anyone else:

In SVN r815830 (September 16, 2009), Grant renamed the field name mapping 
argument "map" to "fmap". The reason was to make naming more consistent with 
the CSV handler. For more info on this see the following thread:

http://www.nabble.com/Fwd%3A-CSV-Update---Need-help-mapping-csv-field-to-schema%27s-ID-td25463942.html
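
To make the distinction behind the rename concrete, here is a small sketch of the two kinds of mapping applied to a document held as a plain Java map. It only illustrates the semantics discussed in that thread (value replacement versus field renaming); the class, the method names and the sample data are hypothetical and do not come from the handler code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MapVsFmapSketch {
    // "fmap"-style mapping: rename a field, keep its value.
    static Map<String, String> applyFieldMap(Map<String, String> doc, Map<String, String> fieldMap) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : doc.entrySet()) {
            out.put(fieldMap.getOrDefault(e.getKey(), e.getKey()), e.getValue());
        }
        return out;
    }

    // "map"-style mapping as used for CSV: replace a value, keep the field name.
    static Map<String, String> applyValueMap(Map<String, String> doc, Map<String, String> valueMap) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : doc.entrySet()) {
            out.put(e.getKey(), valueMap.getOrDefault(e.getValue(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("sku", "A-100");
        doc.put("color", "3");
        System.out.println(applyFieldMap(doc, Map.of("sku", "id")));  // field renamed
        System.out.println(applyValueMap(doc, Map.of("3", "red")));   // value replaced
    }
}
```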



> Parsing Rich Document Types
> ---
>
> Key: SOLR-284
> URL: https://issues.apache.org/jira/browse/SOLR-284
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Reporter: Eric Pugh
>Assignee: Grant Ingersoll
> Fix For: 1.4
>
> Attachments: libs.zip, rich.patch, rich.patch, rich.patch, 
> rich.patch, rich.patch, rich.patch, rich.patch, schema_update.patch, 
> SOLR-284-no-key-gen.patch, SOLR-284.patch, SOLR-284.patch, SOLR-284.patch, 
> SOLR-284.patch, SOLR-284.patch, SOLR-284.patch, SOLR-284.patch, 
> SOLR-284.patch, SOLR-284.patch, solr-word.pdf, source.zip, test-files.zip, 
> test-files.zip, test.zip, un-hardcode-id.diff
>
>
> I have developed a RichDocumentRequestHandler based on the CSVRequestHandler 
> that supports streaming a PDF, Word, Powerpoint, Excel, or PDF document into 
> Solr.
> There is a wiki page with information here: 
> http://wiki.apache.org/solr/UpdateRichDocuments
>  


[jira] Commented: (SOLR-1316) Create autosuggest component

2009-09-16 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756149#action_12756149
 ] 

Andrzej Bialecki  commented on SOLR-1316:
-

bq. Andrzej, why would immutability be a problem? Wouldn't we have to re-build 
the TST if the source index changes?

Well, the use case I have in mind is a TST that improves itself over time based 
on the observed query log. I.e. you would bootstrap a TST from the index (and 
here indeed you can do this on every searcher refresh), but it's often claimed 
that real query logs provide a far better source of autocomplete than the index 
terms. My idea was to start with what you have - in the absence of query logs - 
and then improve upon it by adding successful queries (and removing least-used 
terms to keep the tree at a more or less constant size).

Alternatively we could provide an option to bootstrap it from real query log 
data.

This use case requires mutability, hence my negative opinion about DAWGs 
(besides, we are lacking an implementation, aren't we, whereas we already have a 
few suitable TST implementations). Perhaps this doesn't have to be an 
either/or, if we come up with a pluggable interface for this type of component?
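
As a rough illustration of the mutable, self-improving structure described above, the sketch below keeps per-query usage counts, evicts the least-used entry once a size limit is exceeded, and answers prefix lookups. It deliberately uses a sorted map rather than a ternary search tree to stay short; the class name, the eviction rule and the capacity handling are all assumptions, not part of any attached implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class BoundedSuggestSketch {
    private final int capacity; // assumed to be at least 1
    private final TreeMap<String, Integer> counts = new TreeMap<>();

    BoundedSuggestSketch(int capacity) {
        this.capacity = capacity;
    }

    // Records one successful query and evicts the least-used other entry if over capacity.
    void record(String query) {
        counts.merge(query, 1, Integer::sum);
        if (counts.size() > capacity) {
            String victim = null;
            int min = Integer.MAX_VALUE;
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                if (!e.getKey().equals(query) && e.getValue() < min) {
                    min = e.getValue();
                    victim = e.getKey();
                }
            }
            if (victim != null) {
                counts.remove(victim);
            }
        }
    }

    // Returns up to 'limit' stored queries starting with 'prefix', most used first.
    List<String> suggest(String prefix, int limit) {
        List<Map.Entry<String, Integer>> hits =
            new ArrayList<>(counts.subMap(prefix, prefix + Character.MAX_VALUE).entrySet());
        hits.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : hits) {
            if (out.size() == limit) {
                break;
            }
            out.add(e.getKey());
        }
        return out;
    }

    public static void main(String[] args) {
        BoundedSuggestSketch s = new BoundedSuggestSketch(3);
        for (String q : new String[] {"solr", "solr", "solr", "solar", "lucene", "lucene", "solrj"}) {
            s.record(q);
        }
        System.out.println(s.suggest("sol", 5));
    }
}
```

An immutable structure would need a full rebuild for each of these record calls, which is the cost the comment is objecting to.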

bq. I think the building of the data structure can be done in a way similar to 
what SpellCheckComponent does. [..]

+1


> Create autosuggest component
> 
>
> Key: SOLR-1316
> URL: https://issues.apache.org/jira/browse/SOLR-1316
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Jason Rutherglen
>Priority: Minor
> Fix For: 1.5
>
> Attachments: TernarySearchTree.tar.gz
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Autosuggest is a common search function that can be integrated
> into Solr as a SearchComponent. Our first implementation will
> use the TernaryTree found in Lucene contrib. 
> * Enable creation of the dictionary from the index or via Solr's
> RPC mechanism
> * What types of parameters and settings are desirable?
> * Hopefully in the future we can include user click through
> rates to boost those terms/phrases higher


[jira] Commented: (SOLR-1314) Upgrade Carrot2 to version 3.1.0

2009-09-16 Thread Stanislaw Osinski (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756110#action_12756110
 ] 

Stanislaw Osinski commented on SOLR-1314:
-

Hi Grant,

I've just dropped the patenting clause entirely. The updated license is in the 
repo and at: http://www.carrot2.org/carrot2.LICENSE.

S.

> Upgrade Carrot2 to version 3.1.0
> 
>
> Key: SOLR-1314
> URL: https://issues.apache.org/jira/browse/SOLR-1314
> Project: Solr
>  Issue Type: Task
>Reporter: Stanislaw Osinski
>Assignee: Grant Ingersoll
> Fix For: 1.4
>
>
> As soon as Lucene 2.9 is released, Carrot2 3.1.0 will come out with bug fixes 
> in clustering algorithms and improved clustering in Chinese. The upgrade 
> should be a matter of upgrading {{carrot2-mini.jar}} and 
> {{google-collections.jar}}.


[jira] Commented: (SOLR-1316) Create autosuggest component

2009-09-16 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756094#action_12756094
 ] 

Shalin Shekhar Mangar commented on SOLR-1316:
-

bq. DAWGs are problematic, because they are essentially immutable once created 
(the cost of insert / delete is very high)

Andrzej, why would immutability be a problem? Wouldn't we have to re-build the 
TST if the source index changes?

bq. Also, I think that populating TST from the index would have to be 
discriminative, perhaps based on a threshold

I think the building of the data structure can be done in a way similar to what 
SpellCheckComponent does. We can re-use the HighFrequencyDictionary which can 
give tokens above a certain threshold frequency. The field names to use for 
building the data structure and the analysis can also be done like SCC. The 
response format for this component can also be similar to SCC.
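
The threshold idea can be sketched without any Lucene classes: keep only the terms whose document frequency, as a fraction of all documents, reaches a cutoff. The names below are hypothetical, and whether HighFrequencyDictionary compares with >= or > is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class FrequencyThresholdSketch {
    // Returns, in sorted order, the terms present in at least 'threshold' (0..1) of numDocs.
    static List<String> termsAboveThreshold(Map<String, Integer> docFreq, int numDocs, double threshold) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> e : new TreeMap<>(docFreq).entrySet()) {
            if ((double) e.getValue() / numDocs >= threshold) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Integer> docFreq = Map.of("solr", 50, "lucene", 30, "slor", 1);
        // With 100 documents and a 10% cutoff, the rare misspelling is dropped.
        System.out.println(termsAboveThreshold(docFreq, 100, 0.1));
    }
}
```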

> Create autosuggest component
> 
>
> Key: SOLR-1316
> URL: https://issues.apache.org/jira/browse/SOLR-1316
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Jason Rutherglen
>Priority: Minor
> Fix For: 1.5
>
> Attachments: TernarySearchTree.tar.gz
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Autosuggest is a common search function that can be integrated
> into Solr as a SearchComponent. Our first implementation will
> use the TernaryTree found in Lucene contrib. 
> * Enable creation of the dictionary from the index or via Solr's
> RPC mechanism
> * What types of parameters and settings are desirable?
> * Hopefully in the future we can include user click through
> rates to boost those terms/phrases higher


[jira] Resolved: (SOLR-1407) SpellingQueryConverter now disallows underscores and digits in field names (but allows all UTF-8 letters)

2009-09-16 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1407.
-

Resolution: Fixed

Committed revision 815801.

Thanks David & Michael!

> SpellingQueryConverter now disallows underscores and digits in field names 
> (but allows all UTF-8 letters)
> -
>
> Key: SOLR-1407
> URL: https://issues.apache.org/jira/browse/SOLR-1407
> Project: Solr
>  Issue Type: Improvement
>  Components: spellchecker
>Affects Versions: 1.3
>Reporter: David Bowen
>Assignee: Shalin Shekhar Mangar
>Priority: Trivial
> Fix For: 1.4
>
> Attachments: SOLR-1407.patch, SOLR-1407.patch, 
> SpellingQueryConverter.java, SpellingQueryConverter.java
>
>
> SpellingQueryConverter was extended to cover the full UTF-8 range instead of 
> handling US-ASCII only, but in the process it was broken for field names that 
> contain underscores or digits.


Re: CSV Update - Need help mapping csv field to schema's ID

2009-09-16 Thread Grant Ingersoll



On Sep 16, 2009, at 9:41 AM, Grant Ingersoll wrote:



On Sep 15, 2009, at 8:25 PM, Yonik Seeley wrote:


: .map={sku.field}:{id}

the map param is for replacing a *value* with a different value ... it's
useful for things like numeric codes in CSV files that you want to replace
with strings in your index.


Darn... I shouldn't trust my memory.
From http://issues.apache.org/jira/browse/SOLR-284
'''drop "ext." from parameter names, and revisit naming to try and
unify with other update handlers like CSV'''

So now map.a=b in CSV is for values but map.a=b in SolrCell is for  
fields

perhaps we should change map in SolrCell to fmap?


That's fine by me.  Just update the docs when you're done.


Actually, I can do this now.





My longer range idea was to pull out some generally useful things like
field mapping, etc, such that they could be shared across update
handlers.


See also:
SOLR-1032, SOLR-1069 for related things.  We should be able to
refactor the field mapping code easy enough.




-Yonik
http://www.lucidimagination.com


-- Forwarded message --
From: Chris Hostetter 
Date: Tue, Sep 15, 2009 at 8:12 PM
Subject: Re: CSV Update - Need help mapping csv field to schema's ID
To: solr-u...@lucene.apache.org



: I would like to add an additional name:value pair for every line, mapping the
: sku field to my schema's id field:
:
: .map={sku.field}:{id}

the map param is for replacing a *value* with a different value ... it's
useful for things like numeric codes in CSV files that you want to replace
with strings in your index.

: I would prefer NOT to change the schema by adding a <copyField source="sku"
: dest="id"/>.

that's the only solution i can think of unless you want to write an
UpdateProcessor.


-Hoss


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search




Re: Solr 1.4 Open Issues Status

2009-09-16 Thread Grant Ingersoll

On Sep 15, 2009, at 12:01 PM, Andrzej Bialecki wrote:

Grant Ingersoll wrote:

Here's where we are at for 1.4.  My comments are marked by >.
I think we are in pretty good shape, people just need to make some  
final commits. If things are still unassigned tomorrow morning, I'm  
going to push them to 1.5.

Key | Summary | Assignee

SOLR-1427 | SearchComponents aren't listed on registry.jsp | Grant Ingersoll
> I just put up a patch that I believe is ready to commit.

SOLR-1423 | Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & others | Koji Sekiguchi
> Koji?

SOLR-1407 | SpellingQueryConverter now disallows underscores and digits in field names (but allows all UTF-8 letters) | Shalin Shekhar Mangar
> Needs a patch and a unit test.  Push to 1.5?

SOLR-1396 | standardize the updateprocessorchain syntax | Unassigned
> No patch exists and no work has been done on it.  Seems like we should get this right.  Volunteers?

SOLR-1366 | UnsupportedOperationException may be thrown when using custom IndexReader | Mark Miller
> Patch exists.  Mark?

That patch doesn't solve the issue - it can't be solved without  
serious changes in the replication handler. For now we can only  
clarify the breakage in the documentation.

Care to take up that documentation, Andrzej?

Re: CSV Update - Need help mapping csv field to schema's ID

2009-09-16 Thread Insight 49, LLC


Darn. I hate when I create work for people.

My need is to take a csv file, use the CSV update handler, but then add 
an additional copyfield (sku from csv to id from schema) to create a 
unique id for each record.


Thanks guys. Terrific work on SOLR.

Dan
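
One workaround that avoids both a schema change and a custom UpdateProcessor is to rewrite the CSV before posting it, so that every row carries an id column copied from sku. The sketch below is only an illustration of that idea, not something proposed in the thread; the function name add_id_from_sku and the sample data are made up for the example.

```python
import csv
import io


def add_id_from_sku(src, dst, source_field="sku", dest_field="id"):
    """Copy rows from src to dst, adding dest_field as a copy of source_field."""
    reader = csv.DictReader(src)
    # Append the new column to the header read from the input file.
    fieldnames = list(reader.fieldnames) + [dest_field]
    writer = csv.DictWriter(dst, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[dest_field] = row[source_field]
        writer.writerow(row)


# Hypothetical sample input; a real file would be opened with open(..., newline="").
example = io.StringIO("sku,name\nA100,Widget\nB200,Gadget\n")
out = io.StringIO()
add_id_from_sku(example, out)
print(out.getvalue())
```

The rewritten file can then be sent to the CSV update handler unchanged, at the cost of one extra pass over the data on the client side.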


Grant Ingersoll wrote:


On Sep 15, 2009, at 8:25 PM, Yonik Seeley wrote:


Darn... I shouldn't trust my memory.
From http://issues.apache.org/jira/browse/SOLR-284
'''drop "ext." from parameter names, and revisit naming to try and
unify with other update handlers like CSV'''

So now map.a=b in CSV is for values but map.a=b in SolrCell is for fields

perhaps we should change map in SolrCell to fmap?


That's fine by me.  Just update the docs when you're done.



My longer range idea was to pull out some generally useful things like
field mapping, etc, such that they could be shared across update
handlers.


See also:
 SOLR-1032, SOLR-1069 for related things.  We should be able to refactor 
the field mapping code easy enough.

Re: Solr 1.4 Open Issues Status

2009-09-16 Thread Grant Ingersoll



On Sep 15, 2009, at 6:23 PM, Chris Hostetter wrote:



: > > Hoss' patch is a reasonable start. I think this can be committed. We
: > can iterate in 1.5. Mark or Hoss?
: I think this is a great start for 1.4 and the rest can wait till 1.5,
: but I'll defer to Hoss. I had started working on something more
: complicated, but I prefer Hoss' route.

Mark: my patch was just something i cranked out really quick and dirty to
sanity test that various solr components weren't already causing insanity
... feel free to run with it, i'm a little burnt out right now just
trying to keep up with email.



I vote we go with it for now and then improve in 1.5

[jira] Commented: (SOLR-1316) Create autosuggest component

2009-09-16 Thread Ankul Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756072#action_12756072
 ] 

Ankul Garg commented on SOLR-1316:
--

Removing keys should not affect the balance of the tree, since a key can be
removed simply by setting the boolean end flag at its leaf to false. Adding
keys dynamically won't keep the tree balanced in my implementation, because
the tree is balanced by inserting the keys in a particular order. So when more
keys are added, the TST will have to be rebuilt to make it balanced again.
Will that be problematic?
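
To make the trade-off above concrete, here is a minimal ternary search tree sketch. It is not the code from TernarySearchTree.tar.gz; the class and method names are assumptions made for illustration. It shows removal by clearing the end flag and a balanced build that inserts the median of the sorted keys first.

```python
class Node:
    __slots__ = ("ch", "lo", "eq", "hi", "end")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.end = False  # True when a key terminates at this node


class TernarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, key):
        if key:
            self.root = self._insert(self.root, key, 0)

    def _insert(self, node, key, i):
        ch = key[i]
        if node is None:
            node = Node(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, key, i)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, key, i)
        elif i + 1 < len(key):
            node.eq = self._insert(node.eq, key, i + 1)
        else:
            node.end = True
        return node

    def _find(self, key):
        node, i = self.root, 0
        while node is not None and key:
            ch = key[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            elif i + 1 < len(key):
                node, i = node.eq, i + 1
            else:
                return node
        return None

    def contains(self, key):
        node = self._find(key)
        return node is not None and node.end

    def remove(self, key):
        # Removal only clears the flag; the tree shape is left untouched.
        node = self._find(key)
        if node is not None:
            node.end = False

    @classmethod
    def build_balanced(cls, keys):
        # Insert the median of the sorted keys first, then recurse on each half.
        tree = cls()
        ordered = sorted(set(keys))

        def add(lo, hi):
            if lo > hi:
                return
            mid = (lo + hi) // 2
            tree.insert(ordered[mid])
            add(lo, mid - 1)
            add(mid + 1, hi)

        add(0, len(ordered) - 1)
        return tree


tree = TernarySearchTree.build_balanced(["solr", "lucene", "search", "suggest"])
tree.remove("solr")
```

Keys inserted later through insert() go wherever the existing shape puts them, which is why a periodic rebuild with build_balanced would be needed to restore balance.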




> Create autosuggest component
> 
>
> Key: SOLR-1316
> URL: https://issues.apache.org/jira/browse/SOLR-1316
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Jason Rutherglen
>Priority: Minor
> Fix For: 1.5
>
> Attachments: TernarySearchTree.tar.gz
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Autosuggest is a common search function that can be integrated
> into Solr as a SearchComponent. Our first implementation will
> use the TernaryTree found in Lucene contrib. 
> * Enable creation of the dictionary from the index or via Solr's
> RPC mechanism
> * What types of parameters and settings are desirable?
> * Hopefully in the future we can include user click through
> rates to boost those terms/phrases higher

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Assigned: (SOLR-1433) files included in release that shouldn't be

2009-09-16 Thread Grant Ingersoll (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll reassigned SOLR-1433:
-

Assignee: Grant Ingersoll

> files included in release that shouldn't be
> ---
>
> Key: SOLR-1433
> URL: https://issues.apache.org/jira/browse/SOLR-1433
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Grant Ingersoll
> Fix For: 1.4
>
>
> some files are making it into the release artifacts that shouldn't be ... 
> need to take care of this in the build file prior to releasing 1.4.  details 
> to follow in comments.


[jira] Commented: (SOLR-1437) DIH: Enhance XPathRecordReader to deal with //tagname and other improvments.

2009-09-16 Thread Fergus McMenemie (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756051#action_12756051
 ] 

Fergus McMenemie commented on SOLR-1437:


A pity we may not make the 1.4 release, but I guess there is no harm in trying!

Looking through the code for XPathRecordReader I see a variable skipNextEvent 
inside the parse method. Can anybody explain why we need to skip an event at 
the end of a text block?

> DIH: Enhance XPathRecordReader to deal with //tagname and other improvments.
> 
>
> Key: SOLR-1437
> URL: https://issues.apache.org/jira/browse/SOLR-1437
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
>Reporter: Fergus McMenemie
>Priority: Minor
> Fix For: 1.5
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> As per 
> http://www.nabble.com/Re%3A-Extract-info-from-parent-node-during-data-import-%28redirect%3A%29-td25471162.html
>  it would be nice to be able to use expressions such as //tagname when 
> parsing XML documents.


[jira] Commented: (SOLR-1316) Create autosuggest component

2009-09-16 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756050#action_12756050
 ] 

Andrzej Bialecki  commented on SOLR-1316:
-

bq. These enable suffix compression and create much smaller word graphs.

DAWGs are problematic, because they are essentially immutable once created (the 
cost of insert / delete is very high). So I propose to stick to TSTs for now.

Also, I think that populating TST from the index would have to be 
discriminative, perhaps based on a threshold (so that it only adds terms with 
large enough docFreq), and it would be good to adjust the content of the tree 
based on actual queries that return some results (poor man's auto-learning), 
gradually removing least frequent strings to save space. We could also use as 
a source a field with 1-3 word shingles (no tf, unstored, to save space in the 
source index, with a similar thresholding mechanism).

Ankul, I'm not sure what's the behavior of your implementation when dynamically 
adding / removing keys? Does it still remain balanced?

I also found an MIT-licensed impl. of radix tree here: 
http://code.google.com/p/radixtree, which looks good too, one spelling mistake 
in the API notwithstanding ;)
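
The thresholding idea above can be sketched as follows. This is only an illustration under assumed inputs: the doc_freqs mapping, the function name select_terms and the max_terms cap are hypothetical and do not correspond to any Solr or Lucene API.

```python
def select_terms(doc_freqs, min_doc_freq, max_terms=None):
    """Pick the terms worth loading into the suggestion tree.

    doc_freqs maps term -> document frequency (assumed to be read from the index).
    """
    kept = [(term, df) for term, df in doc_freqs.items() if df >= min_doc_freq]
    # Most frequent first; ties broken alphabetically for a stable result.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    if max_terms is not None:
        # Dropping the tail corresponds to removing the least frequent strings.
        kept = kept[:max_terms]
    return [term for term, _ in kept]


sample = {"solr": 120, "lucene": 95, "slor": 1, "search": 40}
popular = select_terms(sample, min_doc_freq=10)
```

The same function could be fed query-log counts instead of docFreq values to approximate the "poor man's auto-learning" mentioned above.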


> Create autosuggest component
> 
>
> Key: SOLR-1316
> URL: https://issues.apache.org/jira/browse/SOLR-1316
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Jason Rutherglen
>Priority: Minor
> Fix For: 1.5
>
> Attachments: TernarySearchTree.tar.gz
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Autosuggest is a common search function that can be integrated
> into Solr as a SearchComponent. Our first implementation will
> use the TernaryTree found in Lucene contrib. 
> * Enable creation of the dictionary from the index or via Solr's
> RPC mechanism
> * What types of parameters and settings are desirable?
> * Hopefully in the future we can include user click through
> rates to boost those terms/phrases higher


[jira] Commented: (SOLR-1314) Upgrade Carrot2 to version 3.1.0

2009-09-16 Thread Grant Ingersoll (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756049#action_12756049
 ] 

Grant Ingersoll commented on SOLR-1314:
---

bq. As a follow-up of the discussion on legal-discuss

OK, I think that leaves only the patent wording.  My takeaway from the 
legal-discuss thread is that particular line doesn't hold water, so you 
probably could just drop it.  At a minimum, it needs to make explicit it 
pertains to Carrot2 and not be ambiguous as it is now.

Thanks!

> Upgrade Carrot2 to version 3.1.0
> 
>
> Key: SOLR-1314
> URL: https://issues.apache.org/jira/browse/SOLR-1314
> Project: Solr
>  Issue Type: Task
>Reporter: Stanislaw Osinski
>Assignee: Grant Ingersoll
> Fix For: 1.4
>
>
> As soon as Lucene 2.9 is released, Carrot2 3.1.0 will come out with bug fixes 
> in clustering algorithms and improved clustering in Chinese. The upgrade 
> should be a matter of upgrading {{carrot2-mini.jar}} and 
> {{google-collections.jar}}.


Re: distinct example for Solr Cell?

2009-09-16 Thread Yonik Seeley

On Tue, Sep 15, 2009 at 7:31 PM, Chris Hostetter
 wrote:
> I remember a discussion about removing the /update/extract handler from
> ./example/solr/conf/solrconfig.xml so that we could stop copying all the
> jars into ./example/solr/lib/ and have a smaller, simpler, example.

I don't recall that...  I did make the extract handler lazy such that
one could easily remove all of the tika jars from the example w/o
triggering an exception.  We should look into updating the example
README at a minimum.

We should certainly strive for simplicity, but that can go either
way... I like the "batteries included" mentality of python too.

Future idea:
 - make "example" slightly more formal by naming it "server"
 - make server/solr/lib the home for some of these jars (preferably
separated by sub-directory) and make compilation and tests go against
these jars

That would keep the "server" dir self contained (no outside references
- copy it somewhere else to deploy), make our download smaller, and
eliminate the copying around of libs.

> The
> idea being that then there would be a separate distinct set of configs
> providing an example of the extraction handler (with all of its jars)

If it's an example with all of its jars, it seems like it's still a
copy of all those jars, right?
Or, we could put the example in contrib/extracting and make it such
that the code and example server shared the libraries?

-Yonik
http://www.lucidimagination.com

Re: CSV Update - Need help mapping csv field to schema's ID

2009-09-16 Thread Grant Ingersoll



On Sep 15, 2009, at 8:25 PM, Yonik Seeley wrote:


: .map={sku.field}:{id}

the map param is for replacing a *value* with a different value ... it's
useful for things like numeric codes in CSV files that you want to replace
with strings in your index.


Darn... I shouldn't trust my memory.
From http://issues.apache.org/jira/browse/SOLR-284
'''drop "ext." from parameter names, and revisit naming to try and
unify with other update handlers like CSV'''

So now map.a=b in CSV is for values but map.a=b in SolrCell is for fields

perhaps we should change map in SolrCell to fmap?


That's fine by me.  Just update the docs when you're done.



My longer range idea was to pull out some generally useful things like
field mapping, etc, such that they could be shared across update
handlers.


See also:
 SOLR-1032, SOLR-1069 for related things.  We should be able to  
refactor the field mapping code easy enough.
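
The difference between the two meanings of map discussed above (value replacement in the CSV handler versus field renaming in SolrCell) can be sketched with two small helpers. These are illustrative functions only; their names and signatures are assumptions and they are not part of either handler.

```python
def map_values(doc, field, value_map):
    """CSV-style map: replace a field's *value* when it matches a key in value_map."""
    result = dict(doc)
    if field in result:
        result[field] = value_map.get(result[field], result[field])
    return result


def map_fields(doc, field_map):
    """SolrCell-style map (the proposed fmap): rename *fields*, keeping values."""
    return {field_map.get(name, name): value for name, value in doc.items()}


row = {"sku": "A100", "color": "1"}
with_values_mapped = map_values(row, "color", {"1": "red"})
with_fields_mapped = map_fields(row, {"sku": "id"})
```

A shared field-mapping utility of the kind mentioned above would presumably look more like map_fields, which is the behaviour the original CSV question was after.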




-Yonik
http://www.lucidimagination.com


-- Forwarded message --
From: Chris Hostetter 
Date: Tue, Sep 15, 2009 at 8:12 PM
Subject: Re: CSV Update - Need help mapping csv field to schema's ID
To: solr-u...@lucene.apache.org



: I would like to add an additional name:value pair for every line, mapping the
: sku field to my schema's id field:
:
: .map={sku.field}:{id}

the map param is for replacing a *value* with a different value ... it's
useful for things like numeric codes in CSV files that you want to replace
with strings in your index.

: I would prefer NOT to change the schema by adding a <copyField source="sku"
: dest="id"/>.

that's the only solution i can think of unless you want to write an
UpdateProcessor.


-Hoss


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search

[jira] Updated: (SOLR-1437) DIH: Enhance XPathRecordReader to deal with //tagname and other improvments.

2009-09-16 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1437:
-

Fix Version/s: (was: 1.4)
   1.5

it may not be viable to target this for 1.4

> DIH: Enhance XPathRecordReader to deal with //tagname and other improvments.
> 
>
> Key: SOLR-1437
> URL: https://issues.apache.org/jira/browse/SOLR-1437
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
>Reporter: Fergus McMenemie
>Priority: Minor
> Fix For: 1.5
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> As per 
> http://www.nabble.com/Re%3A-Extract-info-from-parent-node-during-data-import-%28redirect%3A%29-td25471162.html
>  it would be nice to be able to use expressions such as //tagname when 
> parsing XML documents.


[jira] Created: (SOLR-1437) DIH: Enhance XPathRecordReader to deal with //tagname and other improvments.

2009-09-16 Thread Fergus McMenemie (JIRA)

DIH: Enhance XPathRecordReader to deal with //tagname and other improvments.


 Key: SOLR-1437
 URL: https://issues.apache.org/jira/browse/SOLR-1437
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
Affects Versions: 1.4
Reporter: Fergus McMenemie
Priority: Minor
 Fix For: 1.4


As per 
http://www.nabble.com/Re%3A-Extract-info-from-parent-node-during-data-import-%28redirect%3A%29-td25471162.html
 it would be nice to be able to use expressions such as //tagname when parsing 
XML documents.
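
For readers unfamiliar with the //tagname form, the sketch below shows what such an expression selects: every element with that name at any depth. It uses Python's standard xml.etree.ElementTree on a made-up document purely as an illustration; DIH's XPathRecordReader is a streaming parser and works differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with <title> elements at different depths.
doc = ET.fromstring(
    "<root>"
    "<a><title>One</title></a>"
    "<b><c><title>Two</title></c></b>"
    "</root>"
)

# ".//title" matches <title> anywhere below the root, like //title in XPath.
titles = [element.text for element in doc.findall(".//title")]
print(titles)
```

With only absolute paths, the two titles above would need two separate expressions (/root/a/title and /root/b/c/title).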


[jira] Updated: (SOLR-1407) SpellingQueryConverter now disallows underscores and digits in field names (but allows all UTF-8 letters)

2009-09-16 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1407:


Attachment: SOLR-1407.patch

# Removed NMTOKEN from value because it will disallow special characters such 
as comma, period etc.
# Any and all characters are permitted in a value except a space character 
(which is the delimiter)
# Added test for the above

The SpellingQueryConverter still breaks for phrase queries which have a space 
in them like field_s:"foo bar". But this issue existed in 1.3 too.

I'll commit this soon.
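
The rules listed above can be illustrated with a small sketch: field names may contain letters, digits and underscores, values may contain anything except a space, and a space delimits tokens. This is not the regex from the patch; the pattern and the function name extract_terms are assumptions made for the example.

```python
import re

# Hypothetical field-name pattern: a word character (Unicode-aware in Python 3),
# followed by more word characters, '.' or '-'.
FIELD_NAME = re.compile(r"\w[\w.\-]*")


def extract_terms(query):
    """Return the values to spell-check, dropping any 'field:' prefix."""
    terms = []
    for token in query.split(" "):  # a space is the only delimiter
        if not token:
            continue
        name, sep, value = token.partition(":")
        if sep and FIELD_NAME.fullmatch(name):
            if value:
                terms.append(value)
        else:
            terms.append(token)
    return terms


print(extract_terms("field_1:foo,bar title:solr-1.4 plain"))
print(extract_terms('field_s:"foo bar"'))
```

The second call shows the phrase-query limitation mentioned above: the quoted phrase is split at the space into two fragments.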

> SpellingQueryConverter now disallows underscores and digits in field names 
> (but allows all UTF-8 letters)
> -
>
> Key: SOLR-1407
> URL: https://issues.apache.org/jira/browse/SOLR-1407
> Project: Solr
>  Issue Type: Improvement
>  Components: spellchecker
>Affects Versions: 1.3
>Reporter: David Bowen
>Assignee: Shalin Shekhar Mangar
>Priority: Trivial
> Fix For: 1.4
>
> Attachments: SOLR-1407.patch, SOLR-1407.patch, 
> SpellingQueryConverter.java, SpellingQueryConverter.java
>
>
> SpellingQueryConverter was extended to cover the full UTF-8 range instead of 
> handling US-ASCII only, but in the process it was broken for field names that 
> contain underscores or digits.


[jira] Commented: (SOLR-1407) SpellingQueryConverter now disallows underscores and digits in field names (but allows all UTF-8 letters)

2009-09-16 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755994#action_12755994
 ] 

Shalin Shekhar Mangar commented on SOLR-1407:
-

bq. Looks good, the only thing I can see doing is moving to incrementToken() 
instead of next(), but that isn't required just yet. 

Thanks, I'll commit this then.

> SpellingQueryConverter now disallows underscores and digits in field names 
> (but allows all UTF-8 letters)
> -
>
> Key: SOLR-1407
> URL: https://issues.apache.org/jira/browse/SOLR-1407
> Project: Solr
>  Issue Type: Improvement
>  Components: spellchecker
>Affects Versions: 1.3
>Reporter: David Bowen
>Assignee: Shalin Shekhar Mangar
>Priority: Trivial
> Fix For: 1.4
>
> Attachments: SOLR-1407.patch, SpellingQueryConverter.java, 
> SpellingQueryConverter.java
>
>
> SpellingQueryConverter was extended to cover the full UTF-8 range instead of 
> handling US-ASCII only, but in the process it was broken for field names that 
> contain underscores or digits.


[jira] Created: (SOLR-1436) Consider changing multi-term queries to use CONSTANT_SCORE_AUTO_REWRITE_DEFAULT

2009-09-16 Thread Mark Miller (JIRA)

Consider changing multi-term queries to use CONSTANT_SCORE_AUTO_REWRITE_DEFAULT 


 Key: SOLR-1436
 URL: https://issues.apache.org/jira/browse/SOLR-1436
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
Priority: Minor


They are CONSTANT_SCORE_AUTO_REWRITE_DEFAULT now, but 
ConstantScoreBooleanQueryRewrite can be faster and 
CONSTANT_SCORE_AUTO_REWRITE_DEFAULT is likely the best setting.


[jira] Commented: (SOLR-1407) SpellingQueryConverter now disallows underscores and digits in field names (but allows all UTF-8 letters)

2009-09-16 Thread Grant Ingersoll (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755976#action_12755976
 ] 

Grant Ingersoll commented on SOLR-1407:
---

Looks good, the only thing I can see doing is moving to incrementToken() 
instead of next(), but that isn't required just yet.

> SpellingQueryConverter now disallows underscores and digits in field names 
> (but allows all UTF-8 letters)
> -
>
> Key: SOLR-1407
> URL: https://issues.apache.org/jira/browse/SOLR-1407
> Project: Solr
>  Issue Type: Improvement
>  Components: spellchecker
>Affects Versions: 1.3
>Reporter: David Bowen
>Assignee: Shalin Shekhar Mangar
>Priority: Trivial
> Fix For: 1.4
>
> Attachments: SOLR-1407.patch, SpellingQueryConverter.java, 
> SpellingQueryConverter.java
>
>
> SpellingQueryConverter was extended to cover the full UTF-8 range instead of 
> handling US-ASCII only, but in the process it was broken for field names that 
> contain underscores or digits.


Build failed in Hudson: Solr-trunk #926

2009-09-16 Thread Apache Hudson Server

See http://hudson.zones.apache.org/hudson/job/Solr-trunk/926/

--
[...truncated 443 lines...]
A   src/test/test-files
A   src/test/test-files/solr
A   src/test/test-files/solr/crazy-path-to-schema.xml
A   src/test/test-files/solr/crazy-path-to-config.xml
A   src/test/test-files/solr/conf
AU  src/test/test-files/solr/conf/solrconfig-duh-optimize.xml
AU  src/test/test-files/solr/conf/solrconfig_perf.xml
AU  src/test/test-files/solr/conf/schema-required-fields.xml
AU  src/test/test-files/solr/conf/elevate.xml
AU  src/test/test-files/solr/conf/schema-replication1.xml
AU  src/test/test-files/solr/conf/solrconfig-transformers.xml
AU  src/test/test-files/solr/conf/schema-replication2.xml
A   src/test/test-files/solr/conf/xslt
A   src/test/test-files/solr/conf/xslt/dummy.xsl
AU  src/test/test-files/solr/conf/solrconfig-master.xml
AU  src/test/test-files/solr/conf/solrconfig-slave1.xml
A   src/test/test-files/solr/conf/schema.xml
AU  src/test/test-files/solr/conf/schema11.xml
AU  src/test/test-files/solr/conf/schema-spellchecker.xml
A   src/test/test-files/solr/conf/stop-1.txt
A   src/test/test-files/solr/conf/stop-2.txt
AU  src/test/test-files/solr/conf/schema12.xml
AU  src/test/test-files/solr/conf/solrconfig-nocache.xml
AU  src/test/test-files/solr/conf/schema-not-required-unique-key.xml
AU  src/test/test-files/solr/conf/solrconfig-spellchecker.xml
AU  src/test/test-files/solr/conf/solrconfig-altdirectory.xml
A   src/test/test-files/solr/conf/solrconfig-enableplugin.xml
AU  src/test/test-files/solr/conf/solrconfig-querysender.xml
AU  src/test/test-files/solr/conf/solrconfig-facet-sort.xml
AU  src/test/test-files/solr/conf/solrconfig-slave.xml
AU  src/test/test-files/solr/conf/schema-reversed.xml
A   src/test/test-files/solr/conf/synonyms.txt
AU  src/test/test-files/solr/conf/solrconfig-functionquery.xml
AU  src/test/test-files/solr/conf/solrconfig-master1.xml
AU  src/test/test-files/solr/conf/solrconfig-master2.xml
A   src/test/test-files/solr/conf/protwords.txt
A   src/test/test-files/solr/conf/stopwords.txt
AU  src/test/test-files/solr/conf/bad-schema.xml
AU  src/test/test-files/solr/conf/schema-minimal.xml
A   src/test/test-files/solr/conf/schema-binaryfield.xml
A   src/test/test-files/solr/conf/solrconfig-solcoreproperties.xml
AU  src/test/test-files/solr/conf/solrconfig-elevate.xml
AU  src/test/test-files/solr/conf/mapping-ISOLatin1Accent.txt
AU  src/test/test-files/solr/conf/schema-copyfield-test.xml
AU  src/test/test-files/solr/conf/schema-trie.xml
A   src/test/test-files/solr/conf/keep-1.txt
A   src/test/test-files/solr/conf/keep-2.txt
AU  src/test/test-files/solr/conf/solrconfig-termindex.xml
AU  src/test/test-files/solr/conf/solrconfig-SOLR-749.xml
A   src/test/test-files/solr/conf/schema-stop-keep.xml
AU  src/test/test-files/solr/conf/solrconfig.xml
AU  src/test/test-files/solr/conf/solrconfig-delpolicy1.xml
AU  src/test/test-files/solr/conf/solrconfig-delpolicy2.xml
AU  src/test/test-files/solr/conf/solrconfig-highlight.xml
A   src/test/test-files/solr/conf/bad_solrconfig.xml
A   src/test/test-files/solr/shared
A   src/test/test-files/solr/shared/conf
AU  src/test/test-files/solr/shared/conf/schema.xml
AU  src/test/test-files/solr/shared/conf/stopwords-en.txt
AU  src/test/test-files/solr/shared/conf/solrconfig.xml
AU  src/test/test-files/solr/shared/conf/stopwords-fr.txt
AU  src/test/test-files/solr/shared/solr.xml
AU  src/test/test-files/sampleDateFacetResponse.xml
AU  src/test/test-files/mailing_lists.pdf
AU  src/test/test-files/books.csv
AU  src/test/test-files/htmlStripReaderTest.html
A   src/test/test-files/README
AU  src/test/test-files/spellings.txt
A   src/test/org
A   src/test/org/apache
A   src/test/org/apache/solr
A   src/test/org/apache/solr/update
A   src/test/org/apache/solr/update/processor
AU  src/test/org/apache/solr/update/processor/UpdateRequestProcessorFactoryTest.java
AU  src/test/org/apache/solr/update/processor/SignatureUpdateProcessorFactoryTest.java
AU  src/test/org/apache/solr/update/processor/CustomUpdateRequestProcessorFactory.java
A   src/test/org/apache/solr/update/AutoCommitTest.java
AU  src/test/org/apache/solr/update/DocumentBuilderTest.java
AU  src/test/org/apache/solr/update/TestIndexingPerformance.java
A   src/test/org/apache/solr/update/DirectUpdateHandlerTest.java
AU  src/test/org/apache/solr/update/DirectUpdateHandlerOptimizeTest.java
AU  src/test/org/apache/solr/TestTrie.java
A   src/test/org/apache/solr/analysis
AU  src/tes

[jira] Updated: (SOLR-1407) SpellingQueryConverter now disallows underscores and digits in field names (but allows all UTF-8 letters)

2009-09-16 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1407:


Attachment: SOLR-1407.patch

# Uses Michael's NMTOKEN regex.
# Added tests for Chinese characters and special characters in field names/values

I added the same NMTOKEN for values also. Otherwise values which have an 
underscore or digit or hyphen are split into multiple tokens at these 
characters. I don't think that should happen. Grant, any thoughts?

> SpellingQueryConverter now disallows underscores and digits in field names 
> (but allows all UTF-8 letters)
> -
>
> Key: SOLR-1407
> URL: https://issues.apache.org/jira/browse/SOLR-1407
> Project: Solr
>  Issue Type: Improvement
>  Components: spellchecker
>Affects Versions: 1.3
>Reporter: David Bowen
>Assignee: Shalin Shekhar Mangar
>Priority: Trivial
> Fix For: 1.4
>
> Attachments: SOLR-1407.patch, SpellingQueryConverter.java, 
> SpellingQueryConverter.java
>
>
> SpellingQueryConverter was extended to cover the full UTF-8 range instead of 
> handling US-ASCII only, but in the process it was broken for field names that 
> contain underscores or digits.


[jira] Updated: (SOLR-1435) ensure that all slaves with same pollInteval fetches index at same time

2009-09-16 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1435:
-

Fix Version/s: 1.4

> ensure that all slaves with same pollInteval fetches index at same time
> ---
>
> Key: SOLR-1435
> URL: https://issues.apache.org/jira/browse/SOLR-1435
> Project: Solr
>  Issue Type: Improvement
>  Components: replication (java)
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 1.4
>
> Attachments: SOLR-1435.patch
>
>
> When pollInterval is set to some value, ensure that all slaves fetch the index 
> at the same time (if their clocks are synchronized) 


[jira] Updated: (SOLR-1435) ensure that all slaves with same pollInteval fetches index at same time

2009-09-16 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1435:
-

Attachment: SOLR-1435.patch

> ensure that all slaves with same pollInteval fetches index at same time
> ---
>
> Key: SOLR-1435
> URL: https://issues.apache.org/jira/browse/SOLR-1435
> Project: Solr
>  Issue Type: Improvement
>  Components: replication (java)
>Reporter: Noble Paul
>Assignee: Noble Paul
> Attachments: SOLR-1435.patch
>
>
> When pollInterval is set to some value, ensure that all slaves fetch the index 
> at the same time (if their clocks are synchronized) 


[jira] Updated: (SOLR-1435) ensure that all slaves with same pollInteval fetches index at same time

2009-09-16 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1435:
-

Attachment: (was: SOLR-1435.patch)

> ensure that all slaves with same pollInteval fetches index at same time
> ---
>
> Key: SOLR-1435
> URL: https://issues.apache.org/jira/browse/SOLR-1435
> Project: Solr
>  Issue Type: Improvement
>  Components: replication (java)
>Reporter: Noble Paul
>Assignee: Noble Paul
>
> When pollInterval is set to some value, ensure that all slaves fetch the index 
> at the same time (if their clocks are synchronized) 


[jira] Updated: (SOLR-1435) ensure that all slaves with same pollInteval fetches index at same time

2009-09-16 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1435:
-

Attachment: SOLR-1435.patch

Should we fix this in Solr 1.4?

> ensure that all slaves with same pollInteval fetches index at same time
> ---
>
> Key: SOLR-1435
> URL: https://issues.apache.org/jira/browse/SOLR-1435
> Project: Solr
>  Issue Type: Improvement
>  Components: replication (java)
>Reporter: Noble Paul
>Assignee: Noble Paul
> Attachments: SOLR-1435.patch
>
>
> When pollInterval is set to some value, ensure that all slaves fetch the index 
> at the same time (if their clocks are synchronized) 


[jira] Created: (SOLR-1435) ensure that all slaves with same pollInteval fetches index at same time

2009-09-16 Thread Noble Paul (JIRA)

ensure that all slaves with same pollInteval fetches index at same time
---

 Key: SOLR-1435
 URL: https://issues.apache.org/jira/browse/SOLR-1435
 Project: Solr
  Issue Type: Improvement
  Components: replication (java)
Reporter: Noble Paul
Assignee: Noble Paul


When pollInterval is set to some value, ensure that all slaves fetch the index at 
the same time (if their clocks are synchronized) 
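
One way to get this behaviour is to schedule each poll on a wall-clock boundary that is a multiple of the interval, rather than at "start time plus interval". The sketch below illustrates that idea only; it is not taken from SOLR-1435.patch, and the function name next_poll_time is hypothetical.

```python
def next_poll_time(now_ms, poll_interval_ms):
    """Return the next instant (ms since epoch) that is a multiple of the interval."""
    return (now_ms // poll_interval_ms + 1) * poll_interval_ms


# Two slaves started at different moments with the same 60-second interval
# end up polling at the same instant, provided their clocks agree.
slave_a = next_poll_time(1_000_500, 60_000)
slave_b = next_poll_time(1_019_999, 60_000)
```

A slave whose clock sits exactly on a boundary waits a full interval, so no two polls are scheduled closer together than pollInterval.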

