from:"Shalin Shekhar Mangar"

Re: Welcome Uwe Schindler to the Lucene PMC

2010-04-01 Thread Shalin Shekhar Mangar

Congratulations Uwe!

On Thu, Apr 1, 2010 at 4:35 PM, Grant Ingersoll gsing...@apache.org wrote:

 I'm pleased to announce that the Lucene PMC has voted to add Uwe Schindler
 to the PMC.  Uwe has been doing a lot of work in Lucene and Solr, including
 several of the last releases in Lucene.

 Please join me in extending congratulations to Uwe!

 -Grant Ingersoll
 PMC Chair
 -
 To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: java-dev-h...@lucene.apache.org




-- 
Regards,
Shalin Shekhar Mangar.

[jira] Commented: (SOLR-469) Data Import RequestHandler

2010-03-24 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12849121#action_12849121
 ] 

Shalin Shekhar Mangar commented on SOLR-469:


Thanks!

Scheduling is not implemented inside Solr. You can use a cron job to 
schedule automatic imports. For example, you can call 
wget "http://solr.host:port/solr/dataimport?command=full-import".
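
A minimal sketch of how such an import URL could be assembled before handing it to cron and wget. The class name `DataImportUrl` and port 8983 are illustrative assumptions; only the `/solr/dataimport?command=full-import` path comes from the comment above.

```java
// Hypothetical helper: builds the DataImportHandler URL that a scheduled job would request.
final class DataImportUrl {
    // host, port and command are placeholders supplied by the caller; nothing here contacts Solr.
    static String build(String host, int port, String command) {
        return "http://" + host + ":" + port + "/solr/dataimport?command=" + command;
    }

    public static void main(String[] args) {
        // Port 8983 is an assumption for illustration only.
        System.out.println(build("solr.host", 8983, "full-import"));
    }
}
```

The scheduling itself stays outside Solr: the cron entry simply requests the resulting URL at the desired interval.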

 Data Import RequestHandler
 --

 Key: SOLR-469
 URL: https://issues.apache.org/jira/browse/SOLR-469
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.3
Reporter: Noble Paul
Assignee: Shalin Shekhar Mangar
 Fix For: 1.3

 Attachments: SOLR-469-contrib.patch, SOLR-469-contrib.patch, 
 SOLR-469-contrib.patch, SOLR-469-contrib.patch, SOLR-469-contrib.patch, 
 SOLR-469-contrib.patch, SOLR-469-contrib.patch, SOLR-469-contrib.patch, 
 SOLR-469-contrib.patch, SOLR-469-contrib.patch, SOLR-469-contrib.patch, 
 SOLR-469-contrib.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, 
 SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, 
 SOLR-469.patch, SOLR-469.patch, xpath-stream.patch


 We need a RequestHandler which can import data from a DB or other dataSources 
 into the Solr index. Think of it as an advanced form of the SqlUpload Plugin 
 (SOLR-103).
 The way it works is as follows.
 * Provide a configuration file (XML) to the Handler which takes in the 
 necessary SQL queries and mappings to a Solr schema
   - It also takes in a properties file for the data source 
 configuration
 * Given the configuration, it can also generate the Solr schema.xml
 * It is registered as a RequestHandler which can take two commands: 
 do-full-import and do-delta-import
   - do-full-import - dumps all the data from the database into the 
 index (based on the SQL query in the configuration)
   - do-delta-import - dumps all the data that has changed since the last 
 import. (We assume a modified-timestamp column in tables)
 * It provides an admin page
   - where we can schedule it to be run automatically at regular 
 intervals
   - It shows the status of the Handler (idle, full-import, 
 delta-import)
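
 The delta-import idea above (pick up only rows whose modified-timestamp is newer than the last import) can be sketched as follows. This is an in-memory illustration under stated assumptions, not the handler's code: the `Row` type, its fields, and `changedSince` are hypothetical names, and a real import would push the same comparison into the SQL query.

```java
import java.util.ArrayList;
import java.util.List;

final class DeltaImportSketch {
    // Hypothetical row: an id plus a last-modified timestamp in epoch milliseconds.
    static final class Row {
        final String id;
        final long modifiedAt;
        Row(String id, long modifiedAt) {
            this.id = id;
            this.modifiedAt = modifiedAt;
        }
    }

    // Returns the ids of rows modified strictly after the previous import time.
    static List<String> changedSince(List<Row> rows, long lastImportTime) {
        List<String> ids = new ArrayList<String>();
        for (Row row : rows) {
            if (row.modifiedAt > lastImportTime) {
                ids.add(row.id);
            }
        }
        return ids;
    }
}
```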

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (SOLR-1799) enable matching of CamelCase with camelcase in WordDelimiterFilter

2010-03-15 Thread Shalin Shekhar Mangar (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Shalin Shekhar Mangar updated SOLR-1799:

Fix Version/s: (was: 1.3)
1.5

enable matching of CamelCase with camelcase in WordDelimiterFilter
--

Key: SOLR-1799
URL: https://issues.apache.org/jira/browse/SOLR-1799
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 1.3, 1.4
Reporter: Chris Darroch
Priority: Minor
Fix For: 1.5

Attachments: SOLR-1799.patch

At the bottom of the WordDelimiterFilter.java code there's the following
comment:
// downsides: if source text is powershot then a query of PowerShot
won't match!
Another serious example for us might be something like an indexed document
containing the word Tribeca or Soho, and then a user trying to search for
TriBeCa or SoHo.
This issue has turned up in a couple of recent mailing list threads:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200908.mbox/%3cfe4f94830908201429j3ffbcdd3s3cb7d80542b31...@mail.gmail.com%3e
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200905.mbox/%3c72d9e9500905121619p68c27099ibc7079e52cb0e...@mail.gmail.com%3e
In the first thread I found the best explication of what my own
misunderstanding was, and it's something I'm sure must trip up other people
as well:
{quote}
I've misunderstood WordDelimiterFilter. You might think that catenateAll=1
would append the full phrase (sans delimiters) as an OR against the query.
So jOkersWild would produce:
j (okers wild) OR jokerswild
But you thought wrong. Its actually:
j (okers wild jokerswild)
Which is confusing and won't match...
{quote}
In the second thread, Yonik Seeley gives a good explanation of why this
occurs, and provides a suggested workaround where you duplicate your data
fields and then query on one using generateWordParts=1 and on the other
using catenateWords=1. That works, but obviously requires data
duplication. In our case, we are also following what I believe is
recommended practice and duplicating our data already into stemmed and
unstemmed indexes. To my mind, to further duplicate both of these fields a
second time, with no difference in the indexed data of the additional copy,
seems needlessly wasteful when the problem lies entirely in the query side of
things.
At any rate, I'm attaching a patch against Solr 1.3 which is rather hacky,
but seems to work for us. In WordDelimiterFilter, if generateWordParts=1
and catenateWords=2, then we move the concatenated word to overlap its
position with the first generated token instead of the last (which is the
behaviour with catenateWords=1). We further insert a preceding dummy flag
token with the special type CATENATE_FIRST.
In SolrPluginUtils in the DisjunctionMaxQueryParser class we just copy in the
entirety of the getFieldQuery() code from Lucene's QueryParser. This is
ugly, I know. This code is then tweaked so that in the case where the dummy
flag token is seen, it creates a BooleanQuery with the following token (the
concatenated word) as a conditional TermQuery clause, and then adds the
generated terms in their usual MultiPhraseQuery as a second conditional
clause.
Now I realize this patch is (a) not likely acceptable on style and elegance
grounds, and (b) only against Solr 1.3, not trunk. My apologies for both;
after I'd spent most of what time I had available tracking down the source of
the problem, I just needed to get something working quickly. Perhaps this
patch will inspire others to greatness, though, or at a minimum provide a
starting point for those who stumble over this same issue.
Thanks for a great application! Cheers.
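
The mismatch described in this issue can be illustrated with a small sketch. It is not the WordDelimiterFilter implementation, which handles many more delimiter types and token positions. `CaseSplitSketch`, `wordParts`, and `terms` are hypothetical names, and only lower-to-upper case transitions are treated as split points.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class CaseSplitSketch {
    // Splits at each lower-case to upper-case transition, e.g. "PowerShot" -> [Power, Shot].
    static List<String> wordParts(String token) {
        List<String> parts = new ArrayList<String>();
        if (token.isEmpty()) {
            return parts;
        }
        int start = 0;
        for (int i = 1; i < token.length(); i++) {
            if (Character.isLowerCase(token.charAt(i - 1)) && Character.isUpperCase(token.charAt(i))) {
                parts.add(token.substring(start, i));
                start = i;
            }
        }
        parts.add(token.substring(start));
        return parts;
    }

    // Lower-cased word parts, optionally followed by the lower-cased catenated word.
    static List<String> terms(String token, boolean catenate) {
        List<String> parts = wordParts(token);
        List<String> out = new ArrayList<String>();
        for (String part : parts) {
            out.add(part.toLowerCase(Locale.ROOT));
        }
        if (catenate && parts.size() > 1) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }
}
```

With word parts only, a query for PowerShot yields the terms power and shot, neither of which equals the indexed term powershot. Adding the catenated word supplies the missing term, which is why its position relative to the generated parts matters on the query side.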


[jira] Updated: (SOLR-1814) select count(distinct fieldname) in SOLR

2010-03-15 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1814:


Fix Version/s: (was: 1.4)

 select count(distinct fieldname) in SOLR
 

 Key: SOLR-1814
 URL: https://issues.apache.org/jira/browse/SOLR-1814
 Project: Solr
  Issue Type: New Feature
  Components: SearchComponents - other
Affects Versions: 1.5
Reporter: Marcus Herou
 Fix For: 1.5

 Attachments: CountComponent.java


 I have seen questions on the mailing list about having the functionality for 
 counting distinct values of a field. We at Tailsweep also want that in, for 
 example, our blog search.
 Example:
 You had 1345 hits on 244 blogs
 The 244 part is not possible in SOLR today (correct me if I am wrong). So 
 I've written a component which does this. Attaching it.
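
 The "hits on N blogs" figure amounts to counting distinct values of one field across the matching documents. A minimal sketch, assuming the field values of the hits are already available as a list; this is not the attached CountComponent, and `DistinctCountSketch` is a hypothetical name.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DistinctCountSketch {
    // Counts how many different values the field takes across the given hits.
    static int countDistinct(List<String> fieldValues) {
        Set<String> seen = new HashSet<String>(fieldValues);
        return seen.size();
    }
}
```

 The set holds one entry per distinct value, so memory grows with the number of distinct values rather than the number of hits.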


[jira] Updated: (SOLR-1814) select count(distinct fieldname) in SOLR

2010-03-15 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1814:


Affects Version/s: (was: 2.0)
   (was: 1.6)
   (was: 1.4)
Fix Version/s: (was: 2.0)
   (was: 1.6)


Re: removal of deprecated HtmlStrip*Tokenizer factories

2010-03-15 Thread Shalin Shekhar Mangar

On Tue, Mar 16, 2010 at 2:09 AM, Robert Muir rcm...@gmail.com wrote:

 Hello,

 Is there any concern with removing the deprecated HtmlStrip*Tokenizer
 factories?

 These can be done with CharFilter instead and they have some problems
 with lucene's trunk.

 If no one objects, I'd like to remove these in the branch.
 Otherwise, Uwe tells me there is some way to make them work if need be.


Is there a way we can fix LUCENE-2098 too?

-- 
Regards,
Shalin Shekhar Mangar.

[jira] Commented: (SOLR-1812) StreamingUpdateSolrServer creates an OutputStreamWriter that it never closes

2010-03-08 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12842717#action_12842717
 ] 

Shalin Shekhar Mangar commented on SOLR-1812:
-

Closing the OutputStreamWriter will close the underlying OutputStream. The 
HttpClient will automatically do that once the request has been sent so there 
is no leak here.
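
The first point, that closing an OutputStreamWriter also closes the stream it wraps, can be checked with a small sketch. `TrackingStream` is a hypothetical stand-in for the request's output stream; HttpClient itself is not involved here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class WriterCloseSketch {
    // Records whether close() reached the underlying stream.
    static final class TrackingStream extends ByteArrayOutputStream {
        boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Writes through an OutputStreamWriter, closes the writer, and reports the stream's state.
    static boolean closingWriterClosesStream() {
        TrackingStream out = new TrackingStream();
        try {
            Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            writer.write("<add/>");
            writer.close();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return out.closed;
    }
}
```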

 StreamingUpdateSolrServer creates an OutputStreamWriter that it never closes
 

 Key: SOLR-1812
 URL: https://issues.apache.org/jira/browse/SOLR-1812
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 1.4
Reporter: Mark Miller
Priority: Minor
 Fix For: 1.5





[jira] Commented: (SOLR-1807) UpdateHandler plugin is not fully supported

2010-03-06 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12842369#action_12842369
 ] 

Shalin Shekhar Mangar commented on SOLR-1807:
-

UpdateHandler is an interface so instead of adding a method to it and breaking 
compatibility, we added it to the DirectUpdateHandler2 class. I guess the only 
way is to change the UpdateHandler interface.
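
The trade-off can be sketched with stand-in types. None of these names are Solr's: `Handler` plays the role of a published plugin interface, and `WriterOpener` is a hypothetical optional capability, shown only as one possible way to avoid both breaking existing implementers and testing for a concrete class. It is not something proposed in this issue.

```java
final class CapabilitySketch {
    // Stand-in for a published plugin interface; adding a method here would break implementers.
    interface Handler {
        void commit();
    }

    // Hypothetical optional capability, kept separate so existing Handler implementations still compile.
    interface WriterOpener {
        void forceOpenWriter();
    }

    static final class FullHandler implements Handler, WriterOpener {
        boolean opened = false;
        public void commit() { }
        public void forceOpenWriter() { opened = true; }
    }

    static final class BasicHandler implements Handler {
        public void commit() { }
    }

    // Uses the capability when present and reports whether it was available.
    static boolean openWriterIfSupported(Handler handler) {
        if (handler instanceof WriterOpener) {
            ((WriterOpener) handler).forceOpenWriter();
            return true;
        }
        return false;
    }
}
```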

 UpdateHandler plugin is not fully supported
 ---

 Key: SOLR-1807
 URL: https://issues.apache.org/jira/browse/SOLR-1807
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 1.4
Reporter: John Wang

 UpdateHandler is published as a supported Plugin, but code such as the 
 following:
 if (core.getUpdateHandler() instanceof DirectUpdateHandler2) {
   ((DirectUpdateHandler2) core.getUpdateHandler()).forceOpenWriter();
 } else {
   LOG.warn("The update handler being used is not an instance or " +
       "sub-class of DirectUpdateHandler2. " +
       "Replicate on Startup cannot work.");
 }
 suggests that it is really not fully supported.
 Must all implementations of UpdateHandler be subclasses of 
 DirectUpdateHandler2 for it to work with replication?


Upgrading Lucene jars to 2.9.2 artifacts

2010-02-27 Thread Shalin Shekhar Mangar

Any objections?

-- 
Regards,
Shalin Shekhar Mangar.

Re: Upgrading Lucene jars to 2.9.2 artifacts

2010-02-27 Thread Shalin Shekhar Mangar

On Sat, Feb 27, 2010 at 8:04 PM, Mark Miller markrmil...@gmail.com wrote:

 On 02/27/2010 05:53 AM, Shalin Shekhar Mangar wrote:

 Any objections?




 Didn't rc2 (that we are on) end up being the final release?


Hmm, I didn't know that. But the lucene contrib jars checked in trunk are
different from the ones on Maven. The revision number is same but the
date/time of the build is different.

For example, the lucene-analyzers-2.9.2.jar:
Maven - Implementation-Version: 2.9.2 912433 - 2010-02-22 00:00:06
Trunk - Implementation-Version: 2.9.2 912433 - 2010-02-21 23:52:03
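
A small sketch of how the two Implementation-Version strings above could be compared field by field. The layout "version revision - build time" is inferred from these two values only, and `ImplVersionSketch` is a hypothetical helper.

```java
final class ImplVersionSketch {
    // Splits "2.9.2 912433 - 2010-02-22 00:00:06" into {version, revision, buildTime}.
    static String[] parse(String implementationVersion) {
        String[] halves = implementationVersion.split(" - ", 2);
        String[] head = halves[0].trim().split("\\s+");
        return new String[] { head[0], head[1], halves[1].trim() };
    }

    // True when version and revision match but the build timestamps differ.
    static boolean sameSourceDifferentBuild(String a, String b) {
        String[] x = parse(a);
        String[] y = parse(b);
        return x[0].equals(y[0]) && x[1].equals(y[1]) && !x[2].equals(y[2]);
    }
}
```

For the two jars above this returns true: same version and revision, built a few minutes apart.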

-- 
Regards,
Shalin Shekhar Mangar.

[jira] Commented: (SOLR-1752) SolrJ fails with exception when passing document ADD and DELETEs in the same request using XML request writer (but not binary request writer)

2010-02-25 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12838486#action_12838486
 ] 

Shalin Shekhar Mangar commented on SOLR-1752:
-

Jayson, Solr's update XML does not define a container tag so we are constrained 
to only one of add/delete/commit/optimize at a time. Binary format of course 
does not have this problem. So unless we decide to add a root tag to the update 
XML, this exception will happen.

So I guess we have the following options:
# Disallow more than one type of operation for any request writer
# Document this behavior in the UpdateRequest javadocs.

I'd prefer #2 even though it is inconsistent.
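
The underlying constraint is plain XML well-formedness: a document may have only one root element, so an add followed by a delete with no container tag cannot be parsed as one document. A minimal sketch using the JDK's own parser rather than the parser Solr uses; `SingleRootSketch` is a hypothetical name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class SingleRootSketch {
    // Returns true when the string parses as a single well-formed XML document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow parse errors instead of letting the default handler print them.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

An add element alone parses; an add element immediately followed by a delete element does not, which matches the "multiple roots" error in the report.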

 SolrJ fails with exception when passing document ADD and DELETEs in the same 
 request using XML request writer (but not binary request writer)
 -

 Key: SOLR-1752
 URL: https://issues.apache.org/jira/browse/SOLR-1752
 Project: Solr
  Issue Type: Bug
  Components: clients - java, update
Affects Versions: 1.4
Reporter: Jayson Minard
Assignee: Shalin Shekhar Mangar
Priority: Blocker

 Add this test to SolrExampleTests.java and it will fail when using the XML 
 Request Writer (now default), but not if you change the SolrExampleJettyTest 
 to use the BinaryRequestWriter.
 {code}
 public void testAddDeleteInSameRequest() throws Exception {
   SolrServer server = getSolrServer();
   SolrInputDocument doc3 = new SolrInputDocument();
   doc3.addField( "id", "id3", 1.0f );
   doc3.addField( "name", "doc3", 1.0f );
   doc3.addField( "price", 10 );
   UpdateRequest up = new UpdateRequest();
   up.add( doc3 );
   up.deleteById("id001");
   up.setWaitFlush(false);
   up.setWaitSearcher(false);
   up.process( server );
 }
 {code}
 terminates with exception:
 {code}
 Feb 3, 2010 8:55:34 AM org.apache.solr.common.SolrException log
 SEVERE: org.apache.solr.common.SolrException: Illegal to have multiple roots (start tag in epilog?).
  at [row,col {unknown-source}]: [1,125]
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:72)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:723)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
 Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
  at [row,col {unknown-source}]: [1,125]
    at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
    at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
    at com.ctc.wstx.sr.BasicStreamReader.handleExtraRoot(BasicStreamReader.java:2155)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2070)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2647)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:90)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
    ... 18 more
 {code}


[jira] Commented: (SOLR-1302) Fun with Distances - Add Distance functions for a variety of things

2010-02-17 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12834933#action_12834933
 ] 

Shalin Shekhar Mangar commented on SOLR-1302:
-

Looking over the source of DistanceUtils#vectorDistance, it seems like there is 
a bug with calculating infinite norm:

Existing code:
{code}
for (int i = 0; i < vec1.length; i++) {
  result = Math.max(vec1[i], vec2[i]);
}
{code}

Shouldn't that be:
{code}
for (int i = 0; i < vec1.length; i++) {
  result = Math.max(result, Math.max(vec1[i], vec2[i]));
}
{code}
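
For comparison, the textbook infinity-norm (Chebyshev) distance between two vectors is the largest absolute per-coordinate difference. The sketch below assumes that definition. It is not the DistanceUtils code and not necessarily what was committed; it also differs from both snippets above, which take the maximum over the raw components. `InfinityNormSketch` is a hypothetical name.

```java
final class InfinityNormSketch {
    // Chebyshev distance: max over i of |vec1[i] - vec2[i]|, accumulated across all coordinates.
    static double distance(double[] vec1, double[] vec2) {
        if (vec1.length != vec2.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double result = 0.0;
        for (int i = 0; i < vec1.length; i++) {
            result = Math.max(result, Math.abs(vec1[i] - vec2[i]));
        }
        return result;
    }
}
```

Either way, the running maximum has to be carried through the loop; otherwise only the last coordinate contributes to the result.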

 Fun with Distances - Add Distance functions for a variety of things
 ---

 Key: SOLR-1302
 URL: https://issues.apache.org/jira/browse/SOLR-1302
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1302.patch, SOLR-1302.patch, SOLR-1302.patch


 There are many distance functions that are useful to have:
 1. Great Circle (lat/lon) and other geo distances
 2. Euclidean (Vector)
 3. Manhattan (Vector)
 4. Cosine (Vector)
 For the vector ones, the idea is that the fields on a document can be used to 
 determine the vector.  


[jira] Commented: (SOLR-1302) Fun with Distances - Add Distance functions for a variety of things

2010-02-17 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12834950#action_12834950
 ] 

Shalin Shekhar Mangar commented on SOLR-1302:
-

Done. Committed revision 911153.


Re: Field value tags

2010-02-14 Thread Shalin Shekhar Mangar

On Sat, Feb 13, 2010 at 11:18 PM, Peter S pete...@hotmail.com wrote:


 Hello Solr-dev,


 I've now implemented a QParserPlugin/QParser for tagging functionality in
 my internal Solr environment, and this is working very nicely.


 The type of functionality offered by tagging isn't currently in Solr, so I
 was thinking this might be a good plugin to contribute to the project.
 Before preparing the plugin for ASF-readiness, it would be great to get
 feedback, comments etc. on what the Solr dev experts think of including this
 sort of thing. If it's deemed useful for inclusion, I'll go ahead and create
 a JIRA issue and prepare the code for ASF.


 Here is a quick precis of what tagging offers:



 First off, for your typical user-based searching of 'shopping cart' or
 google-type doc-scored searching, tagging is probably not what you want.
 Dismax provides a much better fit for this type of searching.

 Tagging provides a means of entering a tag into a query, which, on the
 server (in the plugin) translates to some configured subquery that is
 actually executed by Solr.

 There are a number of cool use-cases for this - the 2 most salient of which
 are these:

 1. To provide a known 'key' at query time, that translates into subqueries
 that the user couldn't/wouldn't/shouldn't know at query time.

 For example, I use this to supply a tag called: 'admins', which, when
 entered into a query, will actually query for all documents that have some
 reference to all administrators/root users in the searched index(es). The
 [securely logged-in] person searching won't know who all the root users are
 (and the list will change over time), only that he/she wishes to find out
 information pertaining to their activity.

 2. To provide subquery 'shortcuts' for often used, usually lengthy and/or
 complicated queries.

 For example, if every morning, as part of your job, you need to search for:

 ((this AND that) OR (theother AND NOT somethingelse)) AND timestamp:[then
 TO now] . . .

 A tag can be made, say, 'mysearchtag' which equates to the above query.

 This tag can then be used as a query, and/or embedded in other queries.

 This is quite handy for automated searching and/or saved searches etc.

 This allows server administrators to control the content that gets returned
 by these queries, thus reducing client-side maintenance.

 Additionally, for distributed searches, evaluated tags can, if desired,
 produce different queries for different shards (e.g. the list of root users
 are different on different machines).

 Any comments, concerns, opinions etc. on a contribution of this type would
 be greatly appreciated.


Thanks Peter. It definitely sounds useful for some use-cases. Can you open a
Jira issue and give a patch?
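
The tag mechanism Peter describes can be sketched as a server-side lookup that swaps a known tag for its configured subquery. This is only an illustration, not his plugin: `TagExpansionSketch`, the whitespace tokenization, and the field names used in the example are all assumptions.

```java
import java.util.Map;

final class TagExpansionSketch {
    // Replaces each whitespace-separated token that is a configured tag with its subquery in parentheses.
    static String expand(String query, Map<String, String> tags) {
        StringBuilder out = new StringBuilder();
        for (String token : query.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            String subquery = tags.get(token);
            out.append(subquery != null ? "(" + subquery + ")" : token);
        }
        return out.toString();
    }
}
```

With a tag admins mapped to a hypothetical subquery such as user:root OR user:admin, the query "admins AND action:login" expands to "(user:root OR user:admin) AND action:login". Because the map lives on the server, it can differ per shard, as the message suggests.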

-- 
Regards,
Shalin Shekhar Mangar.

[jira] Commented: (SOLR-1773) Field Collapsing (lightweight version)

2010-02-13 Thread Shalin Shekhar Mangar (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12833524#action_12833524
]

Shalin Shekhar Mangar commented on SOLR-1773:
-

Koji, have you looked at SOLR-1682? I gave an implementation of the same
approach but that too is only a PoC.

Field Collapsing (lightweight version)
--

Key: SOLR-1773
URL: https://issues.apache.org/jira/browse/SOLR-1773
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 1.4
Reporter: Koji Sekiguchi
Priority: Minor
Attachments: LOADTEST.patch, SOLR-1773.patch

I'd like to start another approach for field collapsing suggested by Yonik on
19/Dec/09 at SOLR-236. Re-posting the idea:
{code}
=== two pass collapsing algorithm for collapse.aggregate=max

First pass: pretend that collapseCount=1
- Use a TreeSet as a priority queue since one can remove and insert
entries.
- A HashMap<Key,TreeSetEntry> will be used to map from collapse group to
top entry in the TreeSet
- compare new doc with smallest element in treeset. If smaller discard and
go to the next doc.
- If new doc is bigger, look up it's group. Use the Map to find if the
group has been added to the TreeSet and add it if not.
- If the new bigger doc is already in the TreeSet, compare with the
document in that group. If bigger, update the node,
remove and re-add to the TreeSet to re-sort.
efficiency: the treeset and hashmap are both only the size of the top number
of docs we are looking at (10 for instance)
We will now have the top 10 documents collapsed by the right field with a
collapseCount of 1. Put another way, we have the top 10 groups.
Second pass (if collapseCount>1):
- create a priority queue for each group (10) of size collapseCount
- re-execute the query (or if the sort within the collapse groups does not
involve score, we could just use the docids gathered during phase 1)
- for each document, find it's appropriate priority queue and insert
- optimization: we can use the previous info from phase1 to even avoid
creating a priority queue if no other items matched.
So instead of creating collapse groups for every group in the set (as is done
now?), we create it for only 10 groups.
Instead of collecting the score for every document in the set (40MB per
request for a 10M doc index is *big*) we re-execute the query if needed.
We could optionally store the score as is done now... but I bet aggregate
throughput on large indexes would be better by just re-executing.
Other thought: we could also cache the first phase in the query cache which
would allow one to quickly move to the 2nd phase for any collapseCount.
{code}
The restriction is:
{quote}
one would not be able to tell the total number of collapsed docs, or the
total number of hits (or the DocSet) after collapsing. So only
collapse.facet=before would be supported.
{quote}
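
The first pass quoted above can be sketched as follows, for collapseCount=1 only. This is a simplified illustration, not code from either patch: `Entry`, `topGroups`, and the tie-breaking on document id are assumptions, and document ids are assumed to be unique.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

final class CollapseFirstPassSketch {
    // Hypothetical hit: the collapse group, the document id and its score.
    static final class Entry {
        final String group;
        final int docId;
        final float score;
        Entry(String group, int docId, float score) {
            this.group = group;
            this.docId = docId;
            this.score = score;
        }
    }

    // Orders by score, then docId, so distinct documents never compare as equal.
    static final Comparator<Entry> BY_SCORE = new Comparator<Entry>() {
        public int compare(Entry a, Entry b) {
            int c = Float.compare(a.score, b.score);
            return c != 0 ? c : Integer.compare(a.docId, b.docId);
        }
    };

    // Returns the best document of each of the top n groups, highest score first.
    static List<Entry> topGroups(List<Entry> docs, int n) {
        TreeSet<Entry> queue = new TreeSet<Entry>(BY_SCORE);   // priority queue with removal
        Map<String, Entry> best = new HashMap<String, Entry>(); // group -> its entry in the queue
        for (Entry doc : docs) {
            Entry current = best.get(doc.group);
            if (current != null) {
                // Group already tracked: keep only its better document.
                if (BY_SCORE.compare(doc, current) > 0) {
                    queue.remove(current);
                    queue.add(doc);
                    best.put(doc.group, doc);
                }
            } else if (queue.size() < n) {
                queue.add(doc);
                best.put(doc.group, doc);
            } else if (n > 0 && BY_SCORE.compare(doc, queue.first()) > 0) {
                // New group beats the weakest tracked group: evict that group.
                Entry evicted = queue.pollFirst();
                best.remove(evicted.group);
                queue.add(doc);
                best.put(doc.group, doc);
            }
        }
        return new ArrayList<Entry>(queue.descendingSet());
    }
}
```

Both structures stay bounded by n, which is the efficiency point made in the quoted description. The second pass would then build one small queue per surviving group.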


[jira] Commented: (SOLR-1316) Create autosuggest component

2010-02-09 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12831544#action_12831544
 ] 

Shalin Shekhar Mangar commented on SOLR-1316:
-

{quote}Where are we on this - do people feel it's ready to commit?{quote}

It has been some time since I looked at it but I don't feel it is ready. Using 
it through spellcheck works but specifying spell check params feels odd. Also, 
I don't know how well it compares to regular TermsComponent or facet.prefix 
searches in terms of memory and cpu cost.

 Create autosuggest component
 

 Key: SOLR-1316
 URL: https://issues.apache.org/jira/browse/SOLR-1316
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.4
Reporter: Jason Rutherglen
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.5

 Attachments: suggest.patch, suggest.patch, suggest.patch, TST.zip

   Original Estimate: 96h
  Remaining Estimate: 96h

 Autosuggest is a common search function that can be integrated
 into Solr as a SearchComponent. Our first implementation will
 use the TernaryTree found in Lucene contrib. 
 * Enable creation of the dictionary from the index or via Solr's
 RPC mechanism
 * What types of parameters and settings are desirable?
 * Hopefully in the future we can include user click through
 rates to boost those terms/phrases higher
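
 As a point of comparison for the dictionary lookup, prefix suggestions can be sketched with a plain sorted set. This deliberately swaps in `java.util.TreeSet` for the TernaryTree named above and is not the patch's implementation; `PrefixSuggestSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

final class PrefixSuggestSketch {
    // Returns up to limit dictionary terms that start with prefix, in sorted order.
    static List<String> suggest(SortedSet<String> dictionary, String prefix, int limit) {
        List<String> out = new ArrayList<String>();
        // tailSet starts at the first term >= prefix; terms sharing the prefix are contiguous from there.
        for (String term : dictionary.tailSet(prefix)) {
            if (!term.startsWith(prefix) || out.size() >= limit) {
                break;
            }
            out.add(term);
        }
        return out;
    }
}
```

 Ranking by popularity or click-through, as suggested in the last bullet, would need weights stored alongside each term.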


[jira] Created: (SOLR-1768) Text Categorization Transformer

2010-02-09 Thread Shalin Shekhar Mangar (JIRA)

Text Categorization Transformer
---

 Key: SOLR-1768
 URL: https://issues.apache.org/jira/browse/SOLR-1768
 Project: Solr
  Issue Type: New Feature
  Components: contrib - DataImportHandler
Reporter: Shalin Shekhar Mangar
Priority: Minor


A Transformer which uses TCatNG - http://tcatng.sourceforge.net/ (BSD license) 
to categorize text.

See original discussion at - 
http://www.lucidimagination.com/search/document/37c1f48fb8224171/is_it_posible_to_exclude_results_from_other_languages


Re: DB Connection

2010-02-05 Thread Shalin Shekhar Mangar

On Thu, Feb 4, 2010 at 8:08 PM, cjkadakia cjkada...@sonicbids.com wrote:


 Thanks. Next, can the 10 seconds be re-configured? We may likely want to
 keep
 the connection alive for a few minutes in case another commit is triggered.
 Is there any reason we may not want to consider this option?


Commit on Solr or on the DB? In any case, creating another connection after a
few minutes is not costly, so why complicate the code?

-- 
Regards,
Shalin Shekhar Mangar.

[jira] Assigned: (SOLR-1752) SolrJ fails with exception when passing document ADD and DELETEs in the same request using XML request writer (but not binary request writer)

2010-02-05 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-1752:
---

Assignee: Shalin Shekhar Mangar

 SolrJ fails with exception when passing document ADD and DELETEs in the same 
 request using XML request writer (but not binary request writer)
 -

 Key: SOLR-1752
 URL: https://issues.apache.org/jira/browse/SOLR-1752
 Project: Solr
  Issue Type: Bug
  Components: clients - java, update
Affects Versions: 1.4
Reporter: Jayson Minard
Assignee: Shalin Shekhar Mangar
Priority: Blocker

 Add this test to SolrExampleTests.java and it will fail when using the XML 
 Request Writer (now default), but not if you change the SolrExampleJettyTest 
 to use the BinaryRequestWriter.
 {code}
  public void testAddDeleteInSameRequest() throws Exception {
 SolrServer server = getSolrServer();
 SolrInputDocument doc3 = new SolrInputDocument();
 doc3.addField( id, id3, 1.0f );
 doc3.addField( name, doc3, 1.0f );
 doc3.addField( price, 10 );
 UpdateRequest up = new UpdateRequest();
 up.add( doc3 );
 up.deleteById(id001);
 up.setWaitFlush(false);
 up.setWaitSearcher(false);
 up.process( server );
   }
 {code}
 terminates with exception:
 {code}
 Feb 3, 2010 8:55:34 AM org.apache.solr.common.SolrException log
 SEVERE: org.apache.solr.common.SolrException: Illegal to have multiple roots (start tag in epilog?).
  at [row,col {unknown-source}]: [1,125]
   at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:72)
   at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
   at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
   at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
   at org.mortbay.jetty.Server.handle(Server.java:285)
   at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
   at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:723)
   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
   at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
   at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
 Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
  at [row,col {unknown-source}]: [1,125]
   at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
   at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
   at com.ctc.wstx.sr.BasicStreamReader.handleExtraRoot(BasicStreamReader.java:2155)
   at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2070)
   at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2647)
   at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
   at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:90)
   at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
   ... 18 more
 {code}
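
 The parser message suggests that the request body carried two top-level
 elements, which XML does not allow. Below is a minimal sketch of that rule,
 assuming the add and delete commands were serialized back to back. The payload
 strings are made up for illustration, and the JDK's DOM parser stands in for
 the Woodstox StAX parser seen in the trace.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class MultipleRootsDemo {

  /** Returns true if the text parses as a single well-formed XML document. */
  static boolean isWellFormed(String xml) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      // Replace the default error handler, which prints fatal errors to stderr;
      // DefaultHandler simply rethrows them.
      builder.setErrorHandler(new DefaultHandler());
      builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      return true;
    } catch (SAXException e) {
      return false; // not well-formed, e.g. a second root element
    } catch (ParserConfigurationException | IOException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Hypothetical payloads, not the exact bytes SolrJ sent.
    String add = "<add><doc><field name=\"id\">id3</field></doc></add>";
    String delete = "<delete><id>id001</id></delete>";

    System.out.println("add alone:      " + isWellFormed(add));
    System.out.println("add + delete:   " + isWellFormed(add + delete));
  }
}
```

 If that assumption holds, sending the add and the delete as two separate
 requests would avoid the second root element.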

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Assigned: (SOLR-1741) NPE when deletionPolicy sets maxOptimizedCommitsTokeep=0

2010-02-05 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-1741:
---

Assignee: Shalin Shekhar Mangar

 NPE when deletionPolicy sets maxOptimizedCommitsTokeep=0
 

 Key: SOLR-1741
 URL: https://issues.apache.org/jira/browse/SOLR-1741
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 1.4
Reporter: Noble Paul
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.5


 This is a user reported issue http://markmail.org/thread/bjcwiw3s66b5x76h


Re: DB Connection

2010-02-04 Thread Shalin Shekhar Mangar

On Thu, Feb 4, 2010 at 12:54 AM, cjkadakia cjkada...@sonicbids.com wrote:


 I looked at some of the references to see if this has been explained or not,
 but I didn't see anything regarding it. I was wondering, quite simply, if
 the SQL Server connection from Solr during indexing is kept alive for all
 subsequent delta-import requests, or does it reopen the connection each time
 and close it after it's finished?


DataImportHandler re-opens the connection if it has not been used for the last
10 seconds. Connections are created at the start of an import and closed
once the import finishes or is aborted.
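
A minimal sketch of that idle-timeout pattern follows. It is not
DataImportHandler's actual code: the class name, the Supplier-based factory,
and the injected clock are all hypothetical, and only the 10-second threshold
comes from the description above.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

final class IdleReopeningHolder<C extends AutoCloseable> {
  static final long IDLE_TIMEOUT_MS = 10_000; // 10 seconds, per the description

  private final Supplier<C> factory; // opens a new connection (hypothetical)
  private final LongSupplier clock;  // current time in milliseconds
  private C current;
  private long lastUsedMs;

  IdleReopeningHolder(Supplier<C> factory, LongSupplier clock) {
    this.factory = factory;
    this.clock = clock;
  }

  /** Returns a connection, reopening it if the old one sat idle too long. */
  C get() {
    long now = clock.getAsLong();
    if (current == null || now - lastUsedMs > IDLE_TIMEOUT_MS) {
      close();
      current = factory.get();
    }
    lastUsedMs = now;
    return current;
  }

  /** Call once the import finishes or is aborted. */
  void close() {
    if (current != null) {
      try {
        current.close();
      } catch (Exception ignored) {
        // Best effort: the connection is being discarded anyway.
      }
      current = null;
    }
  }

  public static void main(String[] args) {
    long[] now = {0L};
    Supplier<AutoCloseable> factory = () -> {
      System.out.println("opening connection");
      return (AutoCloseable) () -> System.out.println("closing connection");
    };
    IdleReopeningHolder<AutoCloseable> holder =
        new IdleReopeningHolder<AutoCloseable>(factory, () -> now[0]);

    holder.get();      // opens
    now[0] = 5_000;
    holder.get();      // reused: idle for only 5 seconds
    now[0] = 20_000;
    holder.get();      // idle for 15 seconds: closed and reopened
    holder.close();    // end of import
  }
}
```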

-- 
Regards,
Shalin Shekhar Mangar.

[jira] Resolved: (SOLR-1701) Off-by-one error in calculating numFound in Distributed Search

2010-01-12 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1701.
-

   Resolution: Invalid
Fix Version/s: (was: 1.5)

Stupid mistake. Used delQ instead of del :(

 Off-by-one error in calculating numFound in Distributed Search
 --

 Key: SOLR-1701
 URL: https://issues.apache.org/jira/browse/SOLR-1701
 Project: Solr
  Issue Type: Bug
  Components: search
Reporter: Shalin Shekhar Mangar
 Attachments: SOLR-1701.patch


 {code}
 // This passes
 query("q", "*:*", "sort", "id asc", "fl", "id,text");
 // This also passes (notice the rows param)
 query("q", "*:*", "sort", "id desc", "rows", 12, "fl", "id,text");
 
 // But this fails
 query("q", "*:*", "sort", "id desc", "fl", "id,text");
 {code}


[jira] Commented: (SOLR-1682) Implement CollapseComponent

2010-01-12 Thread Shalin Shekhar Mangar (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799180#action_12799180
]

Shalin Shekhar Mangar commented on SOLR-1682:
-

bq. Shalin, I tried your patch out and I ran into a few problems with sorting
and the collapse counts which turned out to be bugs.

Thanks Martijn.

{quote}
Though I have a question about the response format. When collapse.threshold is
1 and more than one documents is collapsed then the collapse.count is named
group.size. The field group.numFound is then added as well. Why did you gave it
a different name?
{quote}

Actually I intended to rename collapse.value to group.value and
collapse.count to group.numFound but I forgot to do it in both places.
* group.numFound = the total number of documents belonging to this group (i.e.
have the same group.value)
* group.size = the number of documents in this result set belonging to the same
group which is equal to min(group.numFound, collapse.threshold)

So when collapse.threshold = 1, group.size=1 and group.numFound will be equal
to the number of documents in the same group. Suppose collapse.threshold = 5,
but group.numFound=4 then group.size=4. The group.size is required to read all
docs belonging to the same group without having to maintain a set. Let me know
if you have suggestions for a better name than these.
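
The relationship between the two fields can be sketched as follows. This is
an illustration only: the class and method names are hypothetical and are not
part of the patch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupFieldsDemo {

  /** group.size = min(group.numFound, collapse.threshold). */
  static int groupSize(int numFound, int threshold) {
    return Math.min(numFound, threshold);
  }

  /** Counts the documents sharing each group value (group.numFound per group). */
  static Map<String, Integer> numFoundByGroup(List<String> groupValues) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (String value : groupValues) {
      counts.merge(value, 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    // One entry per matching document: the value of its collapse field.
    List<String> authors =
        Arrays.asList("author1", "author2", "author1", "author1", "author1");
    int threshold = 5; // collapse.threshold

    numFoundByGroup(authors).forEach((value, numFound) ->
        System.out.println("group.value=" + value
            + " group.numFound=" + numFound
            + " group.size=" + groupSize(numFound, threshold)));
  }
}
```

With collapse.threshold = 5, a group of four documents reports
group.numFound=4 and group.size=4, matching the example above.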

{quote}
When collapse.threshold is larger than one two collectors are used. I
understand that in both situations a different algorithm is used. But now also
a search is done twice. Shouldn't it be better to have two complete distinct
collectors that don't depend on one another?
{quote}

We can have distinct collectors. The CollapsedDocCollector uses some of the
data that TopGroupCollector gathers and that is why it uses it directly. We
could keep references to the individual objects that are needed too. As I said,
this is just a PoC and not the final design.

I'll give a new patch with the names fixed for both the cases.

Implement CollapseComponent
---

Key: SOLR-1682
URL: https://issues.apache.org/jira/browse/SOLR-1682
Project: Solr
Issue Type: Sub-task
Components: search
Reporter: Martijn van Groningen
Assignee: Shalin Shekhar Mangar
Fix For: 1.5

Attachments: field-collapsing.patch, SOLR-1682.patch, SOLR-236.patch

Child issue of SOLR-236. This issue is dedicated to field collapsing in
general and all its code (CollapseComponent, DocumentCollapsers and
CollapseCollectors). The main goal is to finalize the request parameters and
response format.


[jira] Updated: (SOLR-1682) Implement CollapseComponent

2010-01-12 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1682:


Attachment: SOLR-1682.patch

Patch which fixes the inconsistent names for the meta fields.

 Implement CollapseComponent
 ---

 Key: SOLR-1682
 URL: https://issues.apache.org/jira/browse/SOLR-1682
 Project: Solr
  Issue Type: Sub-task
  Components: search
Reporter: Martijn van Groningen
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: field-collapsing.patch, SOLR-1682.patch, 
 SOLR-1682.patch, SOLR-236.patch


 Child issue of SOLR-236. This issue is dedicated to field collapsing in 
 general and all its code (CollapseComponent, DocumentCollapsers and 
 CollapseCollectors). The main goal is to finalize the request parameters and 
 response format.


[jira] Commented: (SOLR-1720) replication configuration bug with multiple replicateAfter values

2010-01-12 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799600#action_12799600
 ] 

Shalin Shekhar Mangar commented on SOLR-1720:
-

Yonik, replicateAfter is supposed to be specified multiple times with different 
values. A single replicateAfter with comma-separated values is invalid. So it is 
by design, not a bug. We could change that if you want.

 replication configuration bug with multiple replicateAfter values
 -

 Key: SOLR-1720
 URL: https://issues.apache.org/jira/browse/SOLR-1720
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Yonik Seeley
 Fix For: 1.5


 Jason reported problems with Multiple replicateAfter values - it worked after 
 changing to just commit
 http://www.lucidimagination.com/search/document/e4c9ba46dc03b031/replication_problem


[jira] Commented: (SOLR-1680) Provide an API to specify custom Collectors

2010-01-07 Thread Shalin Shekhar Mangar (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797771#action_12797771
]

Shalin Shekhar Mangar commented on SOLR-1680:
-

bq. Why not broaden this and allow people to pass in their own collectors?

Yes, that is the general idea, though it would be API-driven rather than
configuration-driven. Any component should be able to pass a Collector to the
various SolrIndexSearcher methods.

bq. Also, can you explain a bit more the use case specifically for Field
Collapse?

Field Collapsing needs to use a custom collector. Right now the collector is
hard coded inside SolrIndexSearcher.

bq. Alternatively, given something like LUCENE-2127, we may want Solr to be
able to make query time decisions about what Collector to use.

I guess that decision should be made by QueryComponent? If so, then the ability
to pass a custom Collector to SolrIndexSearcher methods should be enough.
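
As a rough sketch of the API-driven approach: the searcher accepts whatever
collector the calling component hands it instead of hard-coding one. Every
name below is hypothetical; these are not Lucene's Collector or
SolrIndexSearcher's actual signatures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

final class CustomCollectorSketch {

  /** Hypothetical stand-in for a collector: receives each matching doc id. */
  interface HitCollector {
    void collect(int docId);
  }

  /** Hypothetical stand-in for a searcher over doc ids 0..maxDoc-1. */
  static final class ToySearcher {
    private final int maxDoc;

    ToySearcher(int maxDoc) {
      this.maxDoc = maxDoc;
    }

    /** The caller, not the searcher, decides which collector is used. */
    void search(IntPredicate query, HitCollector collector) {
      for (int doc = 0; doc < maxDoc; doc++) {
        if (query.test(doc)) {
          collector.collect(doc);
        }
      }
    }
  }

  /** Component-side helper that supplies its own collector: keep the first n hits. */
  static List<Integer> firstHits(ToySearcher searcher, IntPredicate query, int n) {
    List<Integer> hits = new ArrayList<>();
    searcher.search(query, doc -> {
      if (hits.size() < n) {
        hits.add(doc);
      }
    });
    return hits;
  }

  public static void main(String[] args) {
    ToySearcher searcher = new ToySearcher(10);
    // Match even doc ids and keep the first three.
    System.out.println(firstHits(searcher, doc -> doc % 2 == 0, 3));
  }
}
```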

Provide an API to specify custom Collectors
---

Key: SOLR-1680
URL: https://issues.apache.org/jira/browse/SOLR-1680
Project: Solr
Issue Type: Sub-task
Components: search
Affects Versions: 1.3
Reporter: Martijn van Groningen
Fix For: 1.5

Attachments: field-collapse-core.patch, SOLR-1680.patch

This issue is dedicated to incorporating fieldcollapse's changes into Solr's
core code.
We want to make it possible for components to specify custom Collectors in
SolrIndexSearcher methods.


[jira] Created: (SOLR-1705) Move QueryConvertor into SpellCheckComponent configuration

2010-01-06 Thread Shalin Shekhar Mangar (JIRA)

Move QueryConvertor into SpellCheckComponent configuration
--

 Key: SOLR-1705
 URL: https://issues.apache.org/jira/browse/SOLR-1705
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.5


QueryConvertor is a top-level XML tag in solrconfig.xml but it is used by 
SpellCheckComponent only. Deprecate the current queryConvertor configuration 
and move it inside the SpellCheckComponent configuration.


[jira] Created: (SOLR-1701) Off-by-one error in calculating numFound in Distributed Search

2010-01-05 Thread Shalin Shekhar Mangar (JIRA)

Off-by-one error in calculating numFound in Distributed Search
--

 Key: SOLR-1701
 URL: https://issues.apache.org/jira/browse/SOLR-1701
 Project: Solr
  Issue Type: Bug
  Components: search
Reporter: Shalin Shekhar Mangar
 Fix For: 1.5


{code}
// This passes
query("q", "*:*", "sort", "id asc", "fl", "id,text");

// This also passes (notice the rows param)
query("q", "*:*", "sort", "id desc", "rows", 12, "fl", "id,text");

// But this fails
query("q", "*:*", "sort", "id desc", "fl", "id,text");
{code}


[jira] Updated: (SOLR-1701) Off-by-one error in calculating numFound in Distributed Search

2010-01-05 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1701:


Attachment: SOLR-1701.patch

Test to demonstrate the bug

 Off-by-one error in calculating numFound in Distributed Search
 --

 Key: SOLR-1701
 URL: https://issues.apache.org/jira/browse/SOLR-1701
 Project: Solr
  Issue Type: Bug
  Components: search
Reporter: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: SOLR-1701.patch


 {code}
 // This passes
 query("q", "*:*", "sort", "id asc", "fl", "id,text");
 // This also passes (notice the rows param)
 query("q", "*:*", "sort", "id desc", "rows", 12, "fl", "id,text");
 
 // But this fails
 query("q", "*:*", "sort", "id desc", "fl", "id,text");
 {code}


[jira] Commented: (SOLR-1212) TestNG Test Case

2010-01-04 Thread Shalin Shekhar Mangar (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796123#action_12796123
]

Shalin Shekhar Mangar commented on SOLR-1212:
-

bq. Keeping this out of the codebase would result in the patch being out of
sync with the tree. If there were no licensing restrictions - what is the harm
in having this in the tree.

You wrote this because you needed it at work and I appreciate that you thought
about contributing it to Solr. But from Solr's perspective it is not needed and
therefore I don't see why we should ship it at all. It is a class that is not
used by Solr but would need to be maintained by us if we ship it.

TestNG Test Case
-

Key: SOLR-1212
URL: https://issues.apache.org/jira/browse/SOLR-1212
Project: Solr
Issue Type: New Feature
Components: clients - java
Affects Versions: 1.4
Environment: Java 6
Reporter: Kay Kay
Fix For: 1.5

Attachments: SOLR-1212.patch, testng-5.9-jdk15.jar

Original Estimate: 1h
Remaining Estimate: 1h

TestNG equivalent of AbstractSolrTestCase, without using JUnit at all.
New class created: AbstractSolrNGTest
LICENSE.txt and NOTICE.txt modified as appropriate (TestNG is under Apache
License 2.0).
TestNG 5.9-jdk15 added to lib.
Justification: In some workplaces, people are moving towards TestNG and
taking JUnit out of the classpath altogether. Hence useful in those cases.


[jira] Commented: (SOLR-1602) Refactor SOLR package structure to include o.a.solr.response and move QueryResponseWriters in there

2010-01-03 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796108#action_12796108
 ] 

Shalin Shekhar Mangar commented on SOLR-1602:
-

{quote}
One that springs to mind is updateRequestProcessor going to 
updateRequestProcessorChain.
{quote}

Patrick, I think that change was made in trunk before update processors were 
ever released.

{quote}
I'm both a software developer and a user of SOLR, and the consistent resistance 
to any proposed refactoring is quite troubling.
{quote}

The resistance is not towards refactoring. We are arguing about compatibility, 
not refactoring.

{quote}
And as I noted from a code organization standpoint, placing classes named 
response in a package named request is not subjectively anything - it's poor 
design and it needs to be addressed.
{quote}

I bet 99% of the users do not care about a wrongly named package when 
everything works. But they care when things stop working. Code organization is 
secondary to usability. Let us not cause discomfort to our users for such a 
trivial issue.

{quote}
As for no apparent reason as I mentioned to Noble, end-users of a system 
don't dictate its code-level organization/design.
{quote}

End users do not dictate code level organization but they do have an influence 
when compatibility is involved. In this case, it is an inconvenience for many 
of them which can be avoided easily, so why not?

I agree with Hoss. This is too much discussion over too small an issue. I think 
things are quite clear. Hoss, Erik, Noble and I all feel that breaking 
compatibility is not worth it. So let's do what needs to be done and get on with 
more important things.

 Refactor SOLR package structure to include o.a.solr.response and move 
 QueryResponseWriters in there
 ---

 Key: SOLR-1602
 URL: https://issues.apache.org/jira/browse/SOLR-1602
 Project: Solr
  Issue Type: Improvement
  Components: Response Writers
Affects Versions: 1.2, 1.3, 1.4
 Environment: independent of environment (code structure)
Reporter: Chris A. Mattmann
Assignee: Noble Paul
 Fix For: 1.5

 Attachments: SOLR-1602.Mattmann.112509.patch.txt, 
 SOLR-1602.Mattmann.112509_02.patch.txt, upgrade_solr_config


 Currently all o.a.solr.request.QueryResponseWriter implementations are 
 curiously located in the o.a.solr.request package. Not only is this package 
 getting big (30+ classes), a lot of them are misplaced. There should be a 
 first-class o.a.solr.response package, and the response related classes 
 should be given a home there. Patch forthcoming.


[jira] Assigned: (SOLR-1682) Implement CollapseComponent

2009-12-30 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-1682:
---

Assignee: Shalin Shekhar Mangar

 Implement CollapseComponent
 ---

 Key: SOLR-1682
 URL: https://issues.apache.org/jira/browse/SOLR-1682
 Project: Solr
  Issue Type: Sub-task
  Components: search
Reporter: Martijn van Groningen
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: field-collapsing.patch


 Child issue of SOLR-236. This issue is dedicated to field collapsing in 
 general and all its code (CollapseComponent, DocumentCollapsers and 
 CollapseCollectors). The main goal is to finalize the request parameters and 
 response format.


[jira] Updated: (SOLR-1682) Implement CollapseComponent

2009-12-30 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1682:


Attachment: SOLR-236.patch

Here's an implementation based on [Yonik's 
suggestion|https://issues.apache.org/jira/browse/SOLR-236?focusedCommentId=12792916&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12792916].

This is just a PoC and not fit to be committed. This implementation uses one 
pass for collapse.threshold=1 and two passes for collapse.threshold>1, so it 
should be a lot faster than the previous method, though I haven't benchmarked 
it yet. Memory consumption should be proportional to start+count instead of 
index size.

What is covered:
# Non-adjacent collapsing
# collapse.threshold
# [New response 
format|https://issues.apache.org/jira/browse/SOLR-236?focusedCommentId=12793101&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12793101]
# Includes DocSetAwareCollector interface from SOLR-1680

What is not covered:
# Adjacent collapsing
# Aggregate functions (should be easy to add)
# Faceting (it doesn't keep/return the docsets needed for FacetComponent)
# Caching
# This implementation does not return the correct numFound

The response adds special fields to only the first document in a group. Here's 
a sample of the first document in a group:
{code:xml}
<doc>
  <int name="id">1</int>
  <str name="name_s1">author1</str>
  <str name="title_s1">a tree</str>
  <date name="timestamp">2009-12-30T10:16:51.944Z</date>
  <arr name="multiDefault">
    <str>muLti-Default</str>
  </arr>
  <int name="intDefault">42</int>
  <str name="collapse.value">author1</str>
  <int name="collapse.count">1</int>
  <float name="score">0.67107505</float>
</doc>
{code}

See TestCollapseComponent.java for example usage.

 Implement CollapseComponent
 ---

 Key: SOLR-1682
 URL: https://issues.apache.org/jira/browse/SOLR-1682
 Project: Solr
  Issue Type: Sub-task
  Components: search
Reporter: Martijn van Groningen
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: field-collapsing.patch, SOLR-236.patch


 Child issue of SOLR-236. This issue is dedicated to field collapsing in 
 general and all its code (CollapseComponent, DocumentCollapsers and 
 CollapseCollectors). The main goal is to finalize the request parameters and 
 response format.


[jira] Commented: (SOLR-1688) Inner class FieldCacheSources should be refactored into their own classes

2009-12-28 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794782#action_12794782
 ] 

Shalin Shekhar Mangar commented on SOLR-1688:
-

{quote}
IMO, most of these should remain implementation details (i.e. not public)... 
they weren't thought out in sufficient detail to support as public classes (and 
there has been little reason to do so).
If we need StrValueSource to be public for another issue, then we should limit 
the change to that. 
{quote}

+1

As they say, let's not fix what ain't broken.

{quote}
If they are defined as a core data structure part of the JDK, then I would 
say yes. It's not as black and white of a line as you make it out to be. You 
can have SOLR be entirely a plugin-based system, with nothing but configuration 
inside of SVN, or you can have every piece of code that interacts with SOLR be 
inside the SOLR SVN. Neither solution will work, you have to strike a balance. 
The same applies for code organization and using absolutes or extremes doesn't 
really illustrate much.
{quote}

Chris, we are striving for balance and we are OK with the change to 
StrFieldSource. In this particular case, you seem to be pushing towards 
extremes in the name of consistency.

{quote}
Can you tell me the reason that e.g., StrFieldSource exists inside of StrField 
while DoubleFieldSource exists outside of DoubleField? Or why the other 4 or 5 
FieldSources that are defined inside of their own java file exist there, while 
the other 4 or 5 defined inside of the FieldType's java file exist there? 
What's the litmus test?
{quote}

It is not a public API and I guess that at the time it was written, there was 
no reason to make it one. It was convenient or a matter of personal style or 
most likely a random choice. There is no litmus test and there does not have to 
be one.

{quote}
Because it's more consistent, and thus, more maintainable.
{quote}

Actually it is the other way round. Once you make it public, it is harder to 
maintain. All changes should then be backward compatible as far as possible. 
The bottom line is that making all of them public is not needed. Your opinion 
is that it is broken because it is not consistent. My opinion is that it is OK 
and it does not matter. We shouldn't lean towards making something a public API 
in the name of consistency.

{quote}
Because when you tell someone to modify one of the core FieldSources or 
ValueSources, they know where to look instead of, oh is this one inside of a 
class inside of o.a.solr.schema, or is this one in the o.a.solr.search.function 
package?
{quote}

Most IDEs have a way to go to the source of a particular class; otherwise there 
is grep. The point is that many (most?) of these classes don't need to be 
modified unless in very rare cases. If it becomes a common practice to modify 
them, then there is probably something wrong with our APIs and we need to 
re-think them.

 Inner class FieldCacheSources should be refactored into their own classes
 -

 Key: SOLR-1688
 URL: https://issues.apache.org/jira/browse/SOLR-1688
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
 Environment: indep. of env.
Reporter: Chris A. Mattmann
 Fix For: 1.5

 Attachments: SOLR-1688.Mattmann.122609.patch.txt


 While working on SOLR-1586 I noticed that outside of class level access (or 
 package level), you can't really reference FieldCacheSources that are defined 
 inside of their FieldType constituents (e.g., in the case of StrFieldSource 
 as defined in StrField). What's more troubling is that the 
 FieldType/FieldCacheSources are defined in an inconsistent fashion: some are 
 done as inner classes, e.g., StrFieldSource and SortableFloatFieldSource, 
 while others are defined as individual classes (e.g., FloatFieldSource). This 
 patch will make them all consistent and define each FieldCacheSource as an 
 outside class, present in o.a.solr.search.function.


[jira] Commented: (SOLR-1688) Inner class FieldCacheSources should be refactored into their own classes

2009-12-26 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794664#action_12794664
 ] 

Shalin Shekhar Mangar commented on SOLR-1688:
-

Chris, isn't referring to it as a ValueSource instance enough for SOLR-1586?

{quote}
What's more troubling is that the FieldType/FieldCacheSources are defined in an 
inconsistent fashion: some are done as inner classes, e.g., StrFieldSource and 
SortableFloatFieldSource, while others are defined as individual classes (e.g., 
FloatFieldSource).
{quote}

That is not really a problem. The field types are always loaded by Solr so 
whether they are an inner class or independent does not matter too much.

 Inner class FieldCacheSources should be refactored into their own classes
 -

 Key: SOLR-1688
 URL: https://issues.apache.org/jira/browse/SOLR-1688
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
 Environment: indep. of env.
Reporter: Chris A. Mattmann
 Fix For: 1.5

 Attachments: SOLR-1688.Mattmann.122609.patch.txt


 While working on SOLR-1586 I noticed that outside of class level access (or 
 package level), you can't really reference FieldCacheSources that are defined 
 inside of their FieldType constituents (e.g., in the case of StrFieldSource 
 as defined in StrField). What's more troubling is that the 
 FieldType/FieldCacheSources are defined in an inconsistent fashion: some are 
 done as inner classes, e.g., StrFieldSource and SortableFloatFieldSource, 
 while others are defined as individual classes (e.g., FloatFieldSource). This 
 patch will make them all consistent and define each FieldCacheSource as an 
 outside class, present in o.a.solr.search.function.


[jira] Created: (SOLR-1685) Refactor QueryComponent for easy extensibility

2009-12-24 Thread Shalin Shekhar Mangar (JIRA)

Refactor QueryComponent for easy extensibility
--

 Key: SOLR-1685
 URL: https://issues.apache.org/jira/browse/SOLR-1685
 Project: Solr
  Issue Type: Sub-task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar





[jira] Updated: (SOLR-1685) Refactor QueryComponent for easy extensibility

2009-12-24 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1685:


Attachment: SOLR-1685.patch

Extracted field sort value and prefetch processing into two new methods out of 
QueryComponent#process

 Refactor QueryComponent for easy extensibility
 --

 Key: SOLR-1685
 URL: https://issues.apache.org/jira/browse/SOLR-1685
 Project: Solr
  Issue Type: Sub-task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-1685.patch





[jira] Resolved: (SOLR-1685) Refactor QueryComponent for easy extensibility

2009-12-24 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1685.
-

   Resolution: Fixed
Fix Version/s: 1.5

Committed revision 893723.

 Refactor QueryComponent for easy extensibility
 --

 Key: SOLR-1685
 URL: https://issues.apache.org/jira/browse/SOLR-1685
 Project: Solr
  Issue Type: Sub-task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: SOLR-1685.patch





[jira] Created: (SOLR-1686) Support fixing the number of shards in BaseDistributedTestCase

2009-12-24 Thread Shalin Shekhar Mangar (JIRA)

Support fixing the number of shards in BaseDistributedTestCase
--

 Key: SOLR-1686
 URL: https://issues.apache.org/jira/browse/SOLR-1686
 Project: Solr
  Issue Type: Sub-task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar





[jira] Updated: (SOLR-1686) Support fixing the number of shards in BaseDistributedTestCase

2009-12-24 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1686:


Attachment: SOLR-1686.patch

A new protected flag named fixShardCount is added which can be set to true by 
sub-classes to fix the number of shards being used for testing.
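
A sketch of how such a flag might be used. Apart from the flag name
fixShardCount, everything here is hypothetical (the class names, the
shardCount field, its default, and the random range) and is not taken from
the patch or from BaseDistributedTestCase itself.

```java
import java.util.Random;

final class FixShardCountDemo {

  /** Hypothetical base class for distributed tests. */
  abstract static class ShardedTestBase {
    protected int shardCount = 4;            // assumed default
    protected boolean fixShardCount = false; // sub-classes set this to true to pin shardCount

    /** Returns the number of shards to start for one test run. */
    int shardsForRun(Random random) {
      if (fixShardCount) {
        return shardCount;
      }
      // Otherwise vary the count between 1 and shardCount (inclusive).
      return 1 + random.nextInt(shardCount);
    }
  }

  /** A sub-class whose assertions only hold for exactly three shards. */
  static final class PinnedShardTest extends ShardedTestBase {
    PinnedShardTest() {
      shardCount = 3;
      fixShardCount = true;
    }
  }

  public static void main(String[] args) {
    System.out.println(new PinnedShardTest().shardsForRun(new Random()));
  }
}
```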

 Support fixing the number of shards in BaseDistributedTestCase
 --

 Key: SOLR-1686
 URL: https://issues.apache.org/jira/browse/SOLR-1686
 Project: Solr
  Issue Type: Sub-task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: SOLR-1686.patch





[jira] Resolved: (SOLR-1686) Support fixing the number of shards in BaseDistributedTestCase

2009-12-24 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1686.
-

   Resolution: Fixed
Fix Version/s: 1.5

Committed revision 893725.

 Support fixing the number of shards in BaseDistributedTestCase
 --

 Key: SOLR-1686
 URL: https://issues.apache.org/jira/browse/SOLR-1686
 Project: Solr
  Issue Type: Sub-task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: SOLR-1686.patch





[jira] Updated: (SOLR-236) Field collapsing

2009-12-24 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-236:
---

Attachment: SOLR-236.patch

# Patch updated for SOLR-1685 and SOLR-1686
# The last patch had reverted changes to CollapseComponent configuration in 
solrconfig.xml and solrconfig-fieldcollapse.xml. Synced it back

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, 
 SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, 
 SOLR-236_collapsing.patch


 This patch includes a new feature called field collapsing.
 It is used to collapse a group of results with a similar value for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also duplicate detection.
 http://www.fastsearch.com/glossary.aspx?m=48amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
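
The difference between the two collapse types can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patch's implementation: the `collapse` function is hypothetical, "normal" is assumed to count documents per field value across the whole result list, "adjacent" is assumed to count only consecutive runs, and `collapse_max` is assumed to be the number of documents kept before the rest are collapsed. The real semantics in the patch may differ.

```python
_NO_RUN = object()  # sentinel: no run has started yet


def collapse(docs, field, collapse_type="normal", collapse_max=1):
    """Return the documents that survive collapsing (hypothetical helper)."""
    kept = []
    seen = {}            # normal mode: documents seen so far per field value
    run_value = _NO_RUN  # adjacent mode: field value of the current run
    run_length = 0       # adjacent mode: length of the current run
    for doc in docs:
        value = doc[field]
        if collapse_type == "adjacent":
            # Only consecutive documents with the same value count together.
            if value == run_value:
                run_length += 1
            else:
                run_value, run_length = value, 1
            count = run_length
        else:
            # Every document with the same value counts, wherever it ranks.
            seen[value] = seen.get(value, 0) + 1
            count = seen[value]
        if count <= collapse_max:
            kept.append(doc)
    return kept
```

With results from sites a, a, b, a and `collapse_max=1`, the "normal" sketch keeps the first a and the b, while the "adjacent" sketch also keeps the last a because it starts a new run.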
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)


[jira] Resolved: (SOLR-1676) spellcheck.count has confusing default and documentation

2009-12-23 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1676.
-

   Resolution: Fixed
Fix Version/s: 1.5

Committed revision 893700.

I've added a note in the example solrconfig.xml to refer to the wiki for 
details on the request parameters.

Thanks Daniel!

 spellcheck.count has confusing default and documentation
 

 Key: SOLR-1676
 URL: https://issues.apache.org/jira/browse/SOLR-1676
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.4
Reporter: Daniel Naber
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.5

 Attachments: solr-spellcheck.diff


 It seems spellcheck.count does not just limit the number of results returned, 
 as the documentation claims. Instead, this value is given to the Lucene 
 SpellChecker class which multiplies it by 10 and then only fetches the first 
 spellcheck.count*10 candidates, ignoring all others. The effect is that with 
 a low value for spellcheck.count you might miss good hits. In other words, 
 the first item with spellcheck.count==1 is not always the same item as with 
 e.g. spellcheck.count==10.
 The fix could be to fix the documentation (the comments in the sample 
 solrconfig.xml) to mention this and use a better default.
 The Lucene SpellChecker class says about the numSug parameter: Thus, you 
 should set this value to *at least* 5 for a good suggestion.


[jira] Updated: (SOLR-1682) Implement CollapseComponent

2009-12-23 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1682:


Affects Version/s: (was: 1.3)
  Summary: Implement CollapseComponent  (was: The field collapse )

 Implement CollapseComponent
 ---

 Key: SOLR-1682
 URL: https://issues.apache.org/jira/browse/SOLR-1682
 Project: Solr
  Issue Type: Sub-task
  Components: search
Reporter: Martijn van Groningen
 Fix For: 1.5

 Attachments: field-collapsing.patch


 Child issue of SOLR-236. This issue is dedicated to field collapsing in 
 general and all its code (CollapseComponent, DocumentCollapsers and 
 CollapseCollectors). The main goal is to finalize the request parameters and 
 response format.


[jira] Updated: (SOLR-1680) Provide an API to specify custom Collectors

2009-12-23 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1680:


Description: 
The issue is dedicated to incorporate fieldcollapse's changes to the Solr's 
core code. 

We want to make it possible for components to specify custom Collectors in 
SolrIndexSearcher methods.

  was:Child issue of SOLR-236. The issue is dedicated to incorporate 
fieldcollapse's changes to the Solr's core code.

Summary: Provide an API to specify custom Collectors  (was: 
Fieldcollapse related changes to the core)

 Provide an API to specify custom Collectors
 ---

 Key: SOLR-1680
 URL: https://issues.apache.org/jira/browse/SOLR-1680
 Project: Solr
  Issue Type: Sub-task
  Components: search
Affects Versions: 1.3
Reporter: Martijn van Groningen
 Fix For: 1.5

 Attachments: field-collapse-core.patch


 The issue is dedicated to incorporate fieldcollapse's changes to the Solr's 
 core code. 
 We want to make it possible for components to specify custom Collectors in 
 SolrIndexSearcher methods.


[jira] Commented: (SOLR-236) Field collapsing

2009-12-22 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793607#action_12793607
 ] 

Shalin Shekhar Mangar commented on SOLR-236:


{quote}
This is exactly the point, it's not really meta-data over the document, but on 
the group the document belongs to. And you also need a more obvious way to mark 
this document as a group representation (to distinguish it from other normal 
documents).
{quote}

We show the highest scoring document of a group, so does the fact that the 
metadata belongs to the group and not the document matter at all?

{quote}
But extending the current doc element, doesn't mean we break BWC. Adding a 
collapse-info (or collapse-meta-data) sub element to it, will certainly not 
break anything, specially when we still don't have a formal xsd for the 
responses (I know we're working on it, but it's still not out there so it's 
safe).
{quote}

We are not extending anything. We're just adding a couple of fields which may 
not exist in the index and this is a capability we plan to introduce anyway 
(however this issue does not need to depend on SOLR-1566). The response format 
remains exactly the same. There is no break in compatibility.

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, 
 SOLR-236_collapsing.patch, SOLR-236_collapsing.patch




Re: Distributed search test using only one shard?

2009-12-22 Thread Shalin Shekhar Mangar

On Tue, Dec 22, 2009 at 8:23 PM, Yonik Seeley yo...@lucidimagination.com wrote:

 Looks like the recently committed SOLR-1608 accidentally changed
 this... it was nservers4 before that.


Yes, I changed it for debugging and then forgot to change it back. Sorry
about that.

-- 
Regards,
Shalin Shekhar Mangar.

[jira] Commented: (SOLR-1682) The field collapse

2009-12-22 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793957#action_12793957
 ] 

Shalin Shekhar Mangar commented on SOLR-1682:
-

Isn't this issue the same as SOLR-236? It is better to have patches in one 
place than two. Let's close this one.

 The field collapse 
 ---

 Key: SOLR-1682
 URL: https://issues.apache.org/jira/browse/SOLR-1682
 Project: Solr
  Issue Type: Sub-task
  Components: search
Affects Versions: 1.3
Reporter: Martijn van Groningen
 Fix For: 1.5

 Attachments: field-collapsing.patch


 Child issue of SOLR-236. This issue is dedicated to field collapsing in 
 general and all its code (CollapseComponent, DocumentCollapsers and 
 CollapseCollectors). The main goal is to finalize the request parameters and 
 response format.


[jira] Commented: (SOLR-236) Field collapsing

2009-12-22 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793958#action_12793958
 ] 

Shalin Shekhar Mangar commented on SOLR-236:


@ttdi - Please post your questions to solr-user mailing list. This issue is 
strictly for Solr related development (not usage).

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, 
 SOLR-236_collapsing.patch, SOLR-236_collapsing.patch




[jira] Assigned: (SOLR-1676) spellcheck.count has confusing default and documentation

2009-12-21 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-1676:
---

Assignee: Shalin Shekhar Mangar

 spellcheck.count has confusing default and documentation
 

 Key: SOLR-1676
 URL: https://issues.apache.org/jira/browse/SOLR-1676
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.4
Reporter: Daniel Naber
Assignee: Shalin Shekhar Mangar
Priority: Minor

 It seems spellcheck.count does not just limit the number of results returned, 
 as the documentation claims. Instead, this value is given to the Lucene 
 SpellChecker class which multiplies it by 10 and then only fetches the first 
 spellcheck.count*10 candidates, ignoring all others. The effect is that with 
 a low value for spellcheck.count you might miss good hits. In other words, 
 the first item with spellcheck.count==1 is not always the same item as with 
 e.g. spellcheck.count==10.
 The fix could be to fix the documentation (the comments in the sample 
 solrconfig.xml) to mention this and use a better default.
 The Lucene SpellChecker class says about the numSug parameter: Thus, you 
 should set this value to *at least* 5 for a good suggestion.


[jira] Commented: (SOLR-1676) spellcheck.count has confusing default and documentation

2009-12-21 Thread Shalin Shekhar Mangar (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793158#action_12793158
]

Shalin Shekhar Mangar commented on SOLR-1676:
-

Although it is not documented anywhere, SpellCheckComponent passes
max(spellcheck.count, 5) to the Lucene spellchecker, see
AbstractLuceneSpellChecker line 141 in trunk.
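
The arithmetic can be sketched as follows. This is a minimal Python illustration, not Solr code: both function names are hypothetical, the floor of 5 is taken from the comment above, and the factor of 10 is taken from the issue description.

```python
MIN_SUGGESTIONS = 5    # floor applied by the component, per the comment above
CANDIDATE_FACTOR = 10  # multiplier inside the Lucene SpellChecker, per the issue description


def effective_count(spellcheck_count):
    """Value handed to the Lucene spellchecker (hypothetical helper)."""
    return max(spellcheck_count, MIN_SUGGESTIONS)


def candidates_examined(spellcheck_count):
    """Number of candidates fetched before scoring (hypothetical helper)."""
    return effective_count(spellcheck_count) * CANDIDATE_FACTOR
```

Under these assumptions spellcheck.count=1 examines 50 candidates and spellcheck.count=10 examines 100, which is why the top suggestion can differ between the two.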

bq. The effect is that with a low value for spellcheck.count you might miss
good hits. In other words, the first item with spellcheck.count==1 is not
always the same item as with e.g. spellcheck.count==10.

That is true. It is a trade-off between accuracy and performance. We cannot
avoid this without fetching all results (or a large number of them) internally
and scoring all of them with a distance metric, which can make it very slow.

Do you have any suggestion on how we could improve the documentation?

spellcheck.count has confusing default and documentation

Key: SOLR-1676
URL: https://issues.apache.org/jira/browse/SOLR-1676
Project: Solr
Issue Type: Bug
Components: spellchecker
Affects Versions: 1.4
Reporter: Daniel Naber
Priority: Minor

It seems spellcheck.count does not just limit the number of results returned,
as the documentation claims. Instead, this value is given to the Lucene
SpellChecker class which multiplies it by 10 and then only fetches the first
spellcheck.count*10 candidates, ignoring all others. The effect is that with
a low value for spellcheck.count you might miss good hits. In other words,
the first item with spellcheck.count==1 is not always the same item as with
e.g. spellcheck.count==10.
The fix could be to fix the documentation (the comments in the sample
solrconfig.xml) to mention this and use a better default.
The Lucene SpellChecker class says about the numSug parameter: Thus, you
should set this value to *at least* 5 for a good suggestion.


[jira] Commented: (SOLR-1674) improve analysis tests, cut over to new API

2009-12-21 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793174#action_12793174
 ] 

Shalin Shekhar Mangar commented on SOLR-1674:
-

All tests pass after renaming protWords.txt to protwords.txt. Unfortunately, 
this is too big to review in detail right now but I trust Robert to do the 
right thing :)

bq. If there are no objections I will commit this beautiful addition to our 
analysis tests soon.

+1

 improve analysis tests, cut over to new API
 ---

 Key: SOLR-1674
 URL: https://issues.apache.org/jira/browse/SOLR-1674
 Project: Solr
  Issue Type: Test
  Components: Schema and Analysis
Reporter: Robert Muir
 Attachments: SOLR-1674.patch, SOLR-1674.patch


 This patch
 * converts all analysis tests to use the new tokenstream api
 * converts most tests to use the more stringent assertion mechanisms from 
 lucene
 * adds new tests to improve coverage
 Most bugs found by more stringent testing have been fixed, with the exception 
 of SynonymFilter.
 The problems with this filter are more serious, the previous tests were 
 essentially a no-op.
 The new tests for SynonymFilter test the current behavior, but have FIXMEs 
 with what I think the old test wanted to expect in the comments.


[jira] Commented: (SOLR-1676) spellcheck.count has confusing default and documentation

2009-12-21 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793277#action_12793277
 ] 

Shalin Shekhar Mangar commented on SOLR-1676:
-

I guess it is better to add this information to the SpellCheckComponent wiki 
page and reference that in the example solrconfig.xml. Anybody using 
SpellCheckComponent would anyway need to refer to the wiki to figure out the 
other parameters.

http://wiki.apache.org/solr/SpellCheckComponent

 spellcheck.count has confusing default and documentation
 

 Key: SOLR-1676
 URL: https://issues.apache.org/jira/browse/SOLR-1676
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.4
Reporter: Daniel Naber
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Attachments: solr-spellcheck.diff


 It seems spellcheck.count does not just limit the number of results returned, 
 as the documentation claims. Instead, this value is given to the Lucene 
 SpellChecker class which multiplies it by 10 and then only fetches the first 
 spellcheck.count*10 candidates, ignoring all others. The effect is that with 
 a low value for spellcheck.count you might miss good hits. In other words, 
 the first item with spellcheck.count==1 is not always the same item as with 
 e.g. spellcheck.count==10.
 The fix could be to fix the documentation (the comments in the sample 
 solrconfig.xml) to mention this and use a better default.
 The Lucene SpellChecker class says about the numSug parameter: Thus, you 
 should set this value to *at least* 5 for a good suggestion.


Re: SOLR 1.4 debian packaging

2009-12-21 Thread Shalin Shekhar Mangar

On Tue, Dec 22, 2009 at 4:15 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:


 : @solr-dev: Could sbd. from upstream help us out with a working
 tomcat.policy
 : for solr? For now I just granted all permissions to solr.

  * solr needs read for the conf directory.


With the new Java-based replication in Solr 1.4, people who need
configuration replication need read/write access to the conf directory.


-- 
Regards,
Shalin Shekhar Mangar.

Re: $Id$

2009-12-20 Thread Shalin Shekhar Mangar

On Sun, Dec 20, 2009 at 10:42 PM, Mark Miller markrmil...@gmail.com wrote:

 Robert Muir wrote:
  Hello, I am wondering why we are using $Id$ in solr?
 
  To me it only seems this causes problems with applying patches (it is
  causing Mark a problem right now).
 
  I am trying to see how it is helpful? there are other ways to see the svn
  history that do not cause problems with patches
 
 
 +1 on giving them the boot - we decided the same thing in Lucene - who
 needs them when they cause these problems and offer little to nothing in
 return. And I can't count how many patches I've had to hand fix ...


I agree. They cause problems with patches and I don't see the benefit of
using them in class javadocs, though they are sometimes useful in the
statistics section (for SolrInfoMBean).

-- 
Regards,
Shalin Shekhar Mangar.

[jira] Commented: (SOLR-236) Field collapsing

2009-12-20 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793101#action_12793101
 ] 

Shalin Shekhar Mangar commented on SOLR-236:


How about we change the current field collapsing response format to the 
following?

We add new well-known fields to the document itself, say 
# collapse.value - contains the group field's value for this document
# collapse.count - the number of results collapsed under this document
# collapse.aggregate.function(field-name) - the aggregate value for the given 
function applied to the given field for this document's group

Example:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">2</int>
    <lst name="params">
      <str name="collapse.field">manu_exact</str>
      <str name="collapse.aggregate">max(field1)</str>
      <str name="collapse.aggregate">avg(field1)</str>
      <str name="q">title:test</str>
      <str name="field.collapse">title</str>
      <str name="qt">collapse</str>
    </lst>
  </lst>
  <result name="response" numFound="30" start="0">
    <doc>
      <str name="id">F8V7067-APL-KIT</str>
      <str name="collapse.value">Belkin</str>
      <int name="collapse.count">1</int>
      <int name="collapse.aggregate.max(field1)">100</int>
      <float name="collapse.aggregate.avg(field1)">50.0</float>
    </doc>
    <doc>
      <str name="id">TWINX2048-3200PRO</str>
      <str name="collapse.value">Corsair Microsystems Inc.</str>
      <int name="collapse.count">3</int>
      <int name="collapse.aggregate.max(field1)">100</int>
      <float name="collapse.aggregate.avg(field1)">50.0</float>
    </doc>
  </result>
</response>
{code}

No need to have another section and correlate based on uniqueKeys. For this to 
work, CollapseComponent must generate a custom SolrDocumentList and set it as 
results in the response.
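
A client reading this proposed format could look roughly like the Python sketch below. The XML is a trimmed copy of the example with its markup restored, and `read_groups` is a hypothetical helper, not part of SolrJ or any other Solr client. It only shows that the group data can be read straight off each doc without a second section to correlate.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the proposed response (XML declaration omitted for brevity).
SAMPLE = """
<response>
  <result name="response" numFound="30" start="0">
    <doc>
      <str name="id">F8V7067-APL-KIT</str>
      <str name="collapse.value">Belkin</str>
      <int name="collapse.count">1</int>
    </doc>
    <doc>
      <str name="id">TWINX2048-3200PRO</str>
      <str name="collapse.value">Corsair Microsystems Inc.</str>
      <int name="collapse.count">3</int>
    </doc>
  </result>
</response>
"""


def read_groups(xml_text):
    """Return (id, group value, collapsed count) for each doc (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    groups = []
    for doc in root.iter("doc"):
        # Each child element carries the field name in its "name" attribute.
        fields = {child.get("name"): child.text for child in doc}
        groups.append(
            (fields["id"], fields["collapse.value"], int(fields["collapse.count"]))
        )
    return groups
```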

For request parameters:
# collapse.aggregate - Can we make this a multi-valued parameter instead of 
comma separated?
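
To illustrate the difference on the wire, here is a small Python sketch that uses only the standard library. The parameter values are taken from the example above; nothing here is Solr client code, and the variable names are only for illustration.

```python
from urllib.parse import parse_qs, urlencode

# Multi-valued form: the parameter is simply repeated.
multi = urlencode([
    ("collapse.field", "manu_exact"),
    ("collapse.aggregate", "max(field1)"),
    ("collapse.aggregate", "avg(field1)"),
])

# Comma-separated form: one parameter whose value the server must split itself.
comma = urlencode([
    ("collapse.field", "manu_exact"),
    ("collapse.aggregate", "max(field1),avg(field1)"),
])

# A generic query-string parser already yields a list for the repeated form.
multi_values = parse_qs(multi)["collapse.aggregate"]
comma_values = parse_qs(comma)["collapse.aggregate"]
```

With the repeated form the parser returns two separate values, while the comma-separated form arrives as a single string that needs extra parsing on the server side.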

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, 
 SOLR-236_collapsing.patch, SOLR-236_collapsing.patch




[jira] Updated: (SOLR-236) Field collapsing

2009-12-18 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-236:
---

Attachment: SOLR-236.patch

Changes:

# Modified configuration as Noble suggested. The 
AggregateCollapseCollectorFactory is now PluginInfoInitialized instead of 
NamedListInitialized and functions are plugins. The name attribute is removed 
from collapseCollectorFactory since it is no longer necessary:
{code:xml}
<searchComponent name="collapse" class="org.apache.solr.handler.component.CollapseComponent">
    <collapseCollectorFactory class="solr.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory" />

    <collapseCollectorFactory class="solr.fieldcollapse.collector.FieldValueCountCollapseCollectorFactory" />

    <collapseCollectorFactory class="solr.fieldcollapse.collector.DocumentFieldsCollapseCollectorFactory" />

    <collapseCollectorFactory class="org.apache.solr.search.fieldcollapse.collector.AggregateCollapseCollectorFactory">
      <function name="sum" class="org.apache.solr.search.fieldcollapse.collector.aggregate.SumFunction"/>
      <function name="avg" class="org.apache.solr.search.fieldcollapse.collector.aggregate.AverageFunction"/>
      <function name="min" class="org.apache.solr.search.fieldcollapse.collector.aggregate.MinFunction"/>
      <function name="max" class="org.apache.solr.search.fieldcollapse.collector.aggregate.MaxFunction"/>
    </collapseCollectorFactory>

    <fieldCollapseCache
      class="solr.FastLRUCache"
      size="512"
      initialSize="512"
      autowarmCount="128"/>

</searchComponent>
{code}
# Changed DistributedFieldCollapsingIntegrationTest to use 
BaseDistributedSearchTestCase. This fails right now. I believe there is a bug 
with the distributed implementation. The distributed version returns one extra 
group when compared to the non-distributed version. I've put an @Ignore 
annotation on that test.

We can consider creating the functions through a factory so that they can 
accept initialization parameters. The schema-fieldcollapse.xml and 
solrconfig-fieldcollapse.xml are no longer necessary and can be removed.

Next steps:
# Let us open issues for all the modifications needed in Solr to support this 
feature. That will help us break down this patch into more manageable (and 
easily reviewable) pieces. I guess we need one for providing custom Collectors 
for SolrIndexSearcher methods. Any others?
# The response format is not very clear in the wiki. We should add more 
examples and explain the format.

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, 
 SOLR-236_collapsing.patch, SOLR-236_collapsing.patch



[jira] Resolved: (SOLR-1667) PatternTokenizer does not clearAttributes()

2009-12-18 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1667.
-

   Resolution: Fixed
Fix Version/s: 1.5
 Assignee: Shalin Shekhar Mangar

Committed revision 892217.

Thanks Robert!

 PatternTokenizer does not clearAttributes()
 ---

 Key: SOLR-1667
 URL: https://issues.apache.org/jira/browse/SOLR-1667
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: Robert Muir
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: SOLR-1667.patch


 PatternTokenizer creates tokens, but never calls clearAttributes()
 because of this things like positionIncrementGap are never reset to their 
 default value.
 trivial patch

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (SOLR-1673) Getting the list of terms from more than one field

2009-12-18 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792471#action_12792471
 ] 

Shalin Shekhar Mangar commented on SOLR-1673:
-

Why would that be better? The current way is how HTTP params are supposed to be 
specified when multiple values are not ordered.
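
As a rough illustration of the repeated-parameter convention, here is a minimal sketch that collects every value of one parameter from a raw query string. The class and method names are hypothetical, this is not Solr's parameter parsing, and URL decoding is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

final class TermsFieldParams {
    // Collects every value of the named parameter, in the order it appears.
    // Assumption: the query string is already URL-decoded.
    static List<String> values(String query, String name) {
        List<String> out = new ArrayList<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals(name)) {
                out.add(pair.substring(eq + 1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String query = "terms=true&terms.fl=field1&terms.fl=field2&terms.fl=field3";
        System.out.println(values(query, "terms.fl"));
    }
}
```

With repeated parameters, a field name never has to be split on a delimiter, so a value that itself contains a comma stays intact.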

 Getting the list of terms from more than one field
 --

 Key: SOLR-1673
 URL: https://issues.apache.org/jira/browse/SOLR-1673
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 1.4
 Environment: Operating system - Linux (Archlinux)
 Servlet container - Jetty
Reporter: Siddhant Goel
Priority: Minor
 Fix For: 1.5


 To get the list of terms from more than one field, it's currently required to 
 specify the fields as terms.fl=field1&terms.fl=field2&terms.fl=field3, and so 
 on.
 It would be better if the syntax could be modified to something like 
 terms.fl=field1,field2,field3.


[jira] Assigned: (SOLR-236) Field collapsing

2009-12-17 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-236:
--

Assignee: Shalin Shekhar Mangar

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch


 This patch includes a new feature called field collapsing.
 It is used to collapse a group of results with a similar value for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also "Duplicate detection".
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)
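
One possible reading of the "adjacent" collapse type together with collapse.max can be sketched as follows. This is only an illustration under stated assumptions, not the patch's implementation: the class and method names are hypothetical, the input is the collapse field's value for each result in ranked order, and values are assumed non-null.

```java
import java.util.ArrayList;
import java.util.List;

final class AdjacentCollapseSketch {
    // Keeps at most `max` consecutive entries that share the same field value;
    // further entries in the same run are dropped (collapsed).
    static List<String> collapseAdjacent(List<String> fieldValues, int max) {
        List<String> kept = new ArrayList<>();
        String previous = null;
        int run = 0;
        for (String value : fieldValues) {
            run = value.equals(previous) ? run + 1 : 1;
            if (run <= max) {
                kept.add(value);
            }
            previous = value;
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> sites = java.util.Arrays.asList("a.com", "a.com", "a.com", "b.com", "a.com");
        System.out.println(collapseAdjacent(sites, 1));
    }
}
```

In this sketch a value that reappears after a different one starts a new run, which is what would distinguish adjacent collapsing from collapsing over the whole result set.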


[jira] Assigned: (SOLR-1660) capitalizationfilter crashes if you use the maxWordCountOption

2009-12-17 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-1660:
---

Assignee: Shalin Shekhar Mangar

 capitalizationfilter crashes if you use the maxWordCountOption
 --

 Key: SOLR-1660
 URL: https://issues.apache.org/jira/browse/SOLR-1660
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: Robert Muir
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-1660.patch


 because arrayCopys into null.
 if you want a testcase i can yank it out of in-progress patch from SOLR-1657, 
 but i think its obvious.
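
The failure mode named above, an array copy into a null destination, can be shown in isolation. The sketch below is not the CapitalizationFilter code; the class and method names are hypothetical, and it only demonstrates that System.arraycopy throws a NullPointerException when the destination array is null.

```java
final class ArrayCopyIntoNull {
    // Returns true if copying src into dest fails with a NullPointerException.
    static boolean copyThrowsNpe(char[] src, char[] dest) {
        try {
            System.arraycopy(src, 0, dest, 0, src.length);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        char[] word = {'s', 'o', 'l', 'r'};
        System.out.println(copyThrowsNpe(word, null));
        System.out.println(copyThrowsNpe(word, new char[word.length]));
    }
}
```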


[jira] Resolved: (SOLR-1660) capitalizationfilter crashes if you use the maxWordCountOption

2009-12-17 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1660.
-

   Resolution: Fixed
Fix Version/s: 1.5

Committed revision 891596.

Thanks Robert!

 capitalizationfilter crashes if you use the maxWordCountOption
 --

 Key: SOLR-1660
 URL: https://issues.apache.org/jira/browse/SOLR-1660
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: Robert Muir
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: SOLR-1660.patch


 because arrayCopys into null.
 if you want a testcase i can yank it out of in-progress patch from SOLR-1657, 
 but i think its obvious.


[jira] Commented: (SOLR-1662) BufferedTokenStream incorrect cloning

2009-12-17 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12791866#action_12791866
 ] 

Shalin Shekhar Mangar commented on SOLR-1662:
-

{quote}
So if we decide its the responsibility of the subclass, these implementations 
need thorough tests to see if they are ok or not.
If we add the cloning to BufferedTokenStream itself, then we know they are ok...
{quote}

I think cloning should be done by sub-classes before writing. If 
BufferedTokenStream clones the token then every sub-class pays the price even 
though the use-case may just be to throw the token away.
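
To make the trade-off concrete, here is a minimal sketch that uses a hypothetical reusable token type rather than any Lucene class. It assumes a producer that overwrites one shared instance for every term, and shows that buffered references alias that instance unless the code that buffers them takes a copy first.

```java
import java.util.ArrayList;
import java.util.List;

final class BufferedCloneSketch {
    // Hypothetical stand-in for a reusable token; NOT a Lucene class.
    static final class Tok {
        String text;
        Tok(String text) { this.text = text; }
        Tok copy() { return new Tok(text); }
    }

    // Simulates a stream that reuses one Tok instance for every term.
    // When copyBeforeBuffering is false, every buffered entry is the same object.
    static List<String> buffer(String[] terms, boolean copyBeforeBuffering) {
        Tok reused = new Tok(null);
        List<Tok> buffered = new ArrayList<>();
        for (String term : terms) {
            reused.text = term; // the producer overwrites the shared instance
            buffered.add(copyBeforeBuffering ? reused.copy() : reused);
        }
        List<String> out = new ArrayList<>();
        for (Tok tok : buffered) {
            out.add(tok.text);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] terms = {"A", "B"};
        System.out.println(buffer(terms, false));
        System.out.println(buffer(terms, true));
    }
}
```

Copying only at the point where a token is actually kept means that code which discards tokens never pays for the copy.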

 BufferedTokenStream incorrect cloning
 -

 Key: SOLR-1662
 URL: https://issues.apache.org/jira/browse/SOLR-1662
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: Robert Muir

 As part of writing tests for SOLR-1657, I rewrote one of the base classes 
 (BaseTokenTestCase) to use the new TokenStream API, but also with some 
 additional safety.
 {code}
   public static String tsToString(TokenStream in) throws IOException {
     StringBuilder out = new StringBuilder();
     TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
     // extra safety to enforce that the state is not preserved, and also
     // assign bogus values
     in.clearAttributes();
     termAtt.setTermBuffer("bogusTerm");
     while (in.incrementToken()) {
       if (out.length() > 0)
         out.append(' ');
       out.append(termAtt.term());
       in.clearAttributes();
       termAtt.setTermBuffer("bogusTerm");
     }
     in.close();
     return out.toString();
   }
 {code}
 Setting the term text to bogus values helps find bugs in tokenstreams that do 
 not clear or clone properly. In this case there is a problem with a 
 tokenstream, AB_AAB_Stream in TestBufferedTokenStream: it converts A B -> A A 
 B but does not clone, so the values get overwritten.
 This can be fixed in two ways: 
 * BufferedTokenStream does the cloning
 * subclasses are responsible for the cloning
 The question is which one should it be?


[jira] Assigned: (SOLR-1662) BufferedTokenStream incorrect cloning

2009-12-17 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-1662:
---

Assignee: Shalin Shekhar Mangar

 BufferedTokenStream incorrect cloning
 -

 Key: SOLR-1662
 URL: https://issues.apache.org/jira/browse/SOLR-1662
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: Robert Muir
Assignee: Shalin Shekhar Mangar

 As part of writing tests for SOLR-1657, I rewrote one of the base classes 
 (BaseTokenTestCase) to use the new TokenStream API, but also with some 
 additional safety.
 {code}
   public static String tsToString(TokenStream in) throws IOException {
     StringBuilder out = new StringBuilder();
     TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
     // extra safety to enforce that the state is not preserved, and also
     // assign bogus values
     in.clearAttributes();
     termAtt.setTermBuffer("bogusTerm");
     while (in.incrementToken()) {
       if (out.length() > 0)
         out.append(' ');
       out.append(termAtt.term());
       in.clearAttributes();
       termAtt.setTermBuffer("bogusTerm");
     }
     in.close();
     return out.toString();
   }
 {code}
 Setting the term text to bogus values helps find bugs in tokenstreams that do 
 not clear or clone properly. In this case there is a problem with a 
 tokenstream, AB_AAB_Stream in TestBufferedTokenStream: it converts A B -> A A 
 B but does not clone, so the values get overwritten.
 This can be fixed in two ways: 
 * BufferedTokenStream does the cloning
 * subclasses are responsible for the cloning
 The question is which one should it be?


[jira] Updated: (SOLR-236) Field collapsing

2009-12-17 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-236:
---

Attachment: SOLR-236.patch

Patch in sync with trunk.

# CollapseComponent is PluginInfoInitialized. Removed changes to SolrConfig. 
Note that the collapseCollectorFactories array and the separate fieldCollapsing 
element have been removed from the configuration. This patch has the following 
configuration:
{code:xml}
<searchComponent name="collapse" class="org.apache.solr.handler.component.CollapseComponent">
  <collapseCollectorFactory name="groupDocumentsCounts"
      class="solr.fieldcollapse.collector.DocumentGroupCountCollapseCollectorFactory" />

  <collapseCollectorFactory name="groupFieldValue"
      class="solr.fieldcollapse.collector.FieldValueCountCollapseCollectorFactory" />

  <collapseCollectorFactory name="groupDocumentsFields"
      class="solr.fieldcollapse.collector.DocumentFieldsCollapseCollectorFactory" />

  <collapseCollectorFactory name="groupAggregatedData"
      class="org.apache.solr.search.fieldcollapse.collector.AggregateCollapseCollectorFactory">
    <lst name="aggregateFunctions">
      <str name="sum">org.apache.solr.search.fieldcollapse.collector.aggregate.SumFunction</str>
      <str name="avg">org.apache.solr.search.fieldcollapse.collector.aggregate.AverageFunction</str>
      <str name="min">org.apache.solr.search.fieldcollapse.collector.aggregate.MinFunction</str>
      <str name="max">org.apache.solr.search.fieldcollapse.collector.aggregate.MaxFunction</str>
    </lst>
  </collapseCollectorFactory>

  <fieldCollapseCache
      class="solr.FastLRUCache"
      size="512"
      initialSize="512"
      autowarmCount="128"/>
</searchComponent>
{code}

# I couldn't find where the fieldCollapseCache was being regenerated. It seems 
it is not being thrown away after commits? I have changed it to be re-created 
on each newSearcher event.
# Removed changes to JettySolrRunner, CoreContainer and SolrDispatchFilter for 
the distributed test case. We will refactor it to use 
BaseDistributedSearchTestCase (not implemented yet).

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, 
 SOLR-236_collapsing.patch


 This patch includes a new feature called field collapsing.
 It is used to collapse a group of results with a similar value for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also "Duplicate detection".
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)


[jira] Commented: (SOLR-236) Field collapsing

2009-12-17 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792115#action_12792115
 ] 

Shalin Shekhar Mangar commented on SOLR-236:


{quote}
I'd define large scale for this in a couple of ways:
1. Lots of docs in the result set (10K+)
2. Lots of overall docs (100M+)
3. Lots of queries (> 10 QPS) 
{quote}

Grant, this patch may not be perfect but I think we all agree that it is a 
great start. This is stable, used by many and has been well supported by the 
community. This is also a large patch and, as I know from my 
DataImportHandler experience, maintaining a large patch is quite a pain (and 
DataImportHandler didn't even touch the core). How about we commit this (after 
some review, of course), mark this as experimental (no guarantees of any sort) 
and then start improving it one issue at a time? Alternately, if you are not 
comfortable adding it to trunk, we can commit this on a branch and merge into 
trunk later.

What do you think?

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, 
 SOLR-236_collapsing.patch


 This patch includes a new feature called field collapsing.
 It is used to collapse a group of results with a similar value for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also "Duplicate detection".
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)


[jira] Resolved: (SOLR-1662) BufferedTokenStream incorrect cloning

2009-12-17 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1662.
-

Resolution: Fixed

Committed revision 891889.

Thanks Robert and Uwe!

 BufferedTokenStream incorrect cloning
 -

 Key: SOLR-1662
 URL: https://issues.apache.org/jira/browse/SOLR-1662
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: Robert Muir
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-1662.patch


 As part of writing tests for SOLR-1657, I rewrote one of the base classes 
 (BaseTokenTestCase) to use the new TokenStream API, but also with some 
 additional safety.
 {code}
   public static String tsToString(TokenStream in) throws IOException {
     StringBuilder out = new StringBuilder();
     TermAttribute termAtt = (TermAttribute) in.addAttribute(TermAttribute.class);
     // extra safety to enforce that the state is not preserved, and also
     // assign bogus values
     in.clearAttributes();
     termAtt.setTermBuffer("bogusTerm");
     while (in.incrementToken()) {
       if (out.length() > 0)
         out.append(' ');
       out.append(termAtt.term());
       in.clearAttributes();
       termAtt.setTermBuffer("bogusTerm");
     }
     in.close();
     return out.toString();
   }
 {code}
 Setting the term text to bogus values helps find bugs in tokenstreams that do 
 not clear or clone properly. In this case there is a problem with a 
 tokenstream, AB_AAB_Stream in TestBufferedTokenStream: it converts A B -> A A 
 B but does not clone, so the values get overwritten.
 This can be fixed in two ways: 
 * BufferedTokenStream does the cloning
 * subclasses are responsible for the cloning
 The question is which one should it be?


[jira] Commented: (SOLR-236) Field collapsing

2009-12-17 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792350#action_12792350
 ] 

Shalin Shekhar Mangar commented on SOLR-236:


For Martijn:

{quote}
The reason I added <fieldCollapsing> ... </fieldCollapsing> was to be able to 
support sharing of collapseCollectorFactory instances between different 
collapse components in the near future. You think that is a valid reason for 
that? Or do you think that collapseCollectorFactories shouldn't be shared?
{quote}

I just don't think that we should introduce new tags and new kinds of 
components in solrconfig.xml, particularly those that are useful to only a 
single component. That introduces changes in SolrConfig.java so that it knows 
how to load such things. That is why I moved that configuration inside 
CollapseComponent. Ideally, all components will use PluginInfo and load 
whatever they need from their own PluginInfo object and SolrConfig would not 
need to be changed unless we introduce new kinds of Solr plugins.

Just curious, what would be a use-case for sharing factories (other than 
reducing duplication of configuration) and having multiple CollapseComponent?

{quote}
The CollapseComponentTest was failing. The field collapseCollectorFactories in 
CollapseComponent was null when not specifying any collapse collector factories 
in the solrconfig.xml which resulted in a NPE.
{quote}

Oops, sorry about that. I only ran the tests inside 
org.apache.solr.search.fieldcollapse. I didn't notice there are other tests 
too. Thanks!

bq. The DistributedFieldCollapsingIntegrationTest is still failing, because you 
left out changes in JettySolrRunner, CoreContainer and SolrDispatchFilter from 
my original patch.

I don't think we need to add that functionality to CoreContainer and 
SolrDispatchFilter. It is still possible to specify a different solrconfig and 
schema for a test. Let me see if I can make this work with 
BaseDistributedSearchTestCase

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236.patch, SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, 
 SOLR-236_collapsing.patch


 This patch includes a new feature called field collapsing.
 It is used to collapse a group of results with a similar value for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also "Duplicate detection".
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)


[jira] Updated: (SOLR-1630) StringIndexOutOfBoundsException in SpellCheckComponent

2009-12-16 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1630:


Attachment: SOLR-1630.patch

I'm not able to reproduce this issue. I used Robin's document, schema and 
solrconfig.xml in the form of a unit test and it gives an empty spell check 
response but no exceptions.

 StringIndexOutOfBoundsException in SpellCheckComponent
 --

 Key: SOLR-1630
 URL: https://issues.apache.org/jira/browse/SOLR-1630
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis, spellchecker
Affects Versions: 1.4
 Environment: Solr 1.4
 Lucene 2.9.1
 Win XP
 java version 1.6.0_14
Reporter: Robin Wojciki
Assignee: Shalin Shekhar Mangar
 Attachments: bug.xml, schema.xml, SOLR-1630.patch, solrconfig.xml


 For some documents/search strings, the SpellCheckComponent throws 
 StringIndexOutOfBoundsException
 See: http://www.lucidimagination.com/search/document/3be6555227e031fc/
 h2. Replication
  * Save attached schema.xml and solrconfig.xml in 
 apache-solr-1.4.0/example/solr/conf
  * Start Solr
  * Index attached bug.xml
  * Query [http://localhost:8983/solr/select/?q=awehjse-wjkekw]
 It throws a StringIndexOutOfBoundsException
 {noformat} String index out of range: -7
 java.lang.StringIndexOutOfBoundsException: String index out of range: -7
   at java.lang.AbstractStringBuilder.replace(Unknown Source)
   at java.lang.StringBuilder.replace(Unknown Source)
   at 
 org.apache.solr.handler.component.SpellCheckComponent.toNamedList(SpellCheckComponent.java:248)
   at 
 org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:143)
   at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
 {noformat} 
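
The top frames of the trace point at StringBuilder.replace being called with an out-of-range index. As a standalone illustration (not the SpellCheckComponent logic, and with hypothetical class and method names), the sketch below shows that a negative start offset is enough to trigger this exception type.

```java
final class NegativeReplaceIndex {
    // Returns true if StringBuilder.replace rejects the given offsets
    // with a StringIndexOutOfBoundsException.
    static boolean replaceThrows(String text, int start, int end, String replacement) {
        try {
            new StringBuilder(text).replace(start, end, replacement);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(replaceThrows("awehjse-wjkekw", -7, 0, "x"));
        System.out.println(replaceThrows("awehjse-wjkekw", 0, 7, "x"));
    }
}
```

If offsets are computed against a different string than the one being modified, a negative start like the -7 in the message above is one way such a call could arise; that is an assumption, not something the trace confirms.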


[jira] Commented: (SOLR-1630) StringIndexOutOfBoundsException in SpellCheckComponent

2009-12-16 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12791342#action_12791342
 ] 

Shalin Shekhar Mangar commented on SOLR-1630:
-

Thanks Guillaume, can you give me an example document too?

 StringIndexOutOfBoundsException in SpellCheckComponent
 --

 Key: SOLR-1630
 URL: https://issues.apache.org/jira/browse/SOLR-1630
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis, spellchecker
Affects Versions: 1.4
 Environment: Solr 1.4
 Lucene 2.9.1
 Win XP
 java version 1.6.0_14
Reporter: Robin Wojciki
Assignee: Shalin Shekhar Mangar
 Attachments: bug.xml, schema.xml, SOLR-1630.patch, solrconfig.xml, 
 spellcheckconfig.xml


 For some documents/search strings, the SpellCheckComponent throws 
 StringIndexOutOfBoundsException
 See: http://www.lucidimagination.com/search/document/3be6555227e031fc/
 h2. Replication
  * Save attached schema.xml and solrconfig.xml in 
 apache-solr-1.4.0/example/solr/conf
  * Start Solr
  * Index attached bug.xml
  * Query [http://localhost:8983/solr/select/?q=awehjse-wjkekw]
 It throws a StringIndexOutOfBoundsException
 {noformat} String index out of range: -7
 java.lang.StringIndexOutOfBoundsException: String index out of range: -7
   at java.lang.AbstractStringBuilder.replace(Unknown Source)
   at java.lang.StringBuilder.replace(Unknown Source)
   at 
 org.apache.solr.handler.component.SpellCheckComponent.toNamedList(SpellCheckComponent.java:248)
   at 
 org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:143)
   at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
 {noformat} 


[jira] Commented: (SOLR-236) Field collapsing

2009-12-16 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12791836#action_12791836
 ] 

Shalin Shekhar Mangar commented on SOLR-236:


Does anybody have a reason for why this should not be committed to trunk as it 
stands right now?

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch


 This patch includes a new feature called field collapsing.
 It is used to collapse a group of results with similar values for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also Duplicate detection.
 http://www.fastsearch.com/glossary.aspx?m=48amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)


[jira] Commented: (SOLR-17) XSD for solr requests/responses

2009-12-15 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12790621#action_12790621
 ] 

Shalin Shekhar Mangar commented on SOLR-17:
---

This is like a solution looking for a problem.

 XSD for solr requests/responses
 ---

 Key: SOLR-17
 URL: https://issues.apache.org/jira/browse/SOLR-17
 Project: Solr
  Issue Type: Improvement
Reporter: Mike Baranczak
Priority: Minor
 Attachments: solr-complex.xml, solr-rev2.xsd, solr.xsd, 
 UselessRequestHandler.java


 Attaching an XML schema definition for the responses and the update requests. 
 I needed to do this for myself anyway, so I might as well contribute it to 
 the project.
 At the moment, I have no plans to write an XSD for the config documents, but 
 it wouldn't be a bad idea.
 TODO: change the schema URL. I'm guessing that Apache already has some sort 
 of naming convention for these?


[jira] Issue Comment Edited: (SOLR-1006) ConcurrentLRUCache API improvements

2009-12-15 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12790644#action_12790644
 ] 

Shalin Shekhar Mangar edited comment on SOLR-1006 at 12/15/09 10:18 AM:


I don't have a use-case for this anymore. Let us close this issue.

  was (Author: shalinmangar):
I don't have a a use-case for this anymore. Let us close this issue.
  
 ConcurrentLRUCache API improvements
 ---

 Key: SOLR-1006
 URL: https://issues.apache.org/jira/browse/SOLR-1006
 Project: Solr
  Issue Type: Improvement
Reporter: Noble Paul
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-1006.patch, SOLR-1006.patch


 This is to make ConcurrentLRUCache more consistent with LinkedHashMap behavior
 # remove must not call evictionListener.evictedEntry()
 # -EvictionListener must be able prevent eviction of an element by returning 
 a false.-
 # Add a new method Map getOldestItems(long n)
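
The LinkedHashMap behavior that the first item refers to can be illustrated with a minimal sketch. It uses only java.util.LinkedHashMap, not ConcurrentLRUCache; the class and method names (LruEvictionDemo, RecordingLru, run) are made up for this illustration. The point is that an explicit remove() never consults removeEldestEntry(), so nothing is reported as evicted for it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LruEvictionDemo {

    /** Hypothetical LRU map that records keys evicted for exceeding capacity. */
    static class RecordingLru<K, V> extends LinkedHashMap<K, V> {
        final int capacity;
        final List<K> evicted = new ArrayList<K>();

        RecordingLru(int capacity) {
            super(16, 0.75f, true); // true = access order, as an LRU cache needs
            this.capacity = capacity;
        }

        // Called by LinkedHashMap after put/putAll only, never after remove().
        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            boolean evict = size() > capacity;
            if (evict) {
                evicted.add(eldest.getKey());
            }
            return evict;
        }
    }

    static List<String> run() {
        RecordingLru<String, Integer> cache = new RecordingLru<String, Integer>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.remove("a"); // explicit removal: not recorded as an eviction
        cache.put("c", 3); // size is 2, still within capacity
        cache.put("d", 4); // size would be 3, so the eldest entry ("b") is evicted
        return cache.evicted;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [b]
    }
}
```

Only "b" is recorded: "a" left the map through remove(), which is the behavior the first item asks ConcurrentLRUCache to match.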


[jira] Updated: (SOLR-1006) ConcurrentLRUCache API improvements

2009-12-15 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1006:


Description: 
This is to make ConcurrentLRUCache more consistent with LinkedHashMap behavior

# remove must not call evictionListener.evictedEntry()
# -EvictionListener must be able prevent eviction of an element by returning a 
false.-
# Add a new method Map getOldestItems(long n)

  was:
This is to make ConcurrentLRUCache more consistent with LinkedHashMap behavior

# remove must not call evictionListener.evictedEntry()
# EvictionListener must be able prevent eviction of an element by returning a 
false.
# Add a new method Map getOldestItems(long n)


I don't have a use-case for this anymore. Let us close this issue.

 ConcurrentLRUCache API improvements
 ---

 Key: SOLR-1006
 URL: https://issues.apache.org/jira/browse/SOLR-1006
 Project: Solr
  Issue Type: Improvement
Reporter: Noble Paul
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-1006.patch, SOLR-1006.patch


 This is to make ConcurrentLRUCache more consistent with LinkedHashMap behavior
 # remove must not call evictionListener.evictedEntry()
 # -EvictionListener must be able prevent eviction of an element by returning 
 a false.-
 # Add a new method Map getOldestItems(long n)


[jira] Closed: (SOLR-1006) ConcurrentLRUCache API improvements

2009-12-15 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar closed SOLR-1006.
---

   Resolution: Fixed
Fix Version/s: (was: 1.5)
   1.4

 ConcurrentLRUCache API improvements
 ---

 Key: SOLR-1006
 URL: https://issues.apache.org/jira/browse/SOLR-1006
 Project: Solr
  Issue Type: Improvement
Reporter: Noble Paul
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-1006.patch, SOLR-1006.patch


 This is to make ConcurrentLRUCache more consistent with LinkedHashMap behavior
 # remove must not call evictionListener.evictedEntry()
 # -EvictionListener must be able prevent eviction of an element by returning 
 a false.-
 # Add a new method Map getOldestItems(long n)


[jira] Updated: (SOLR-1645) Add human content-type

2009-12-15 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1645:


Fix Version/s: (was: 1.4)
   1.5

1.4 has been released. Marking for 1.5 instead.

 Add human content-type
 --

 Key: SOLR-1645
 URL: https://issues.apache.org/jira/browse/SOLR-1645
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 1.4
Reporter: Khalid Yagoubi
 Fix For: 1.5


 The idea is to allow Solr Cell to calculate the human content-type from the 
 extracted content-type and map it to a field in the schema, 
 so the user can search on media:image or media:video.
 Ideas:
 1) Hardcode a hashmap somewhere in the extraction classes and get the human 
 content-type from the extracted content-type. I am thinking of SolrContentHandler.java.
 2) Write an XML file where we can put a mapping, like in tika-config.xml for 
 parsers.
 3) Use tika-config.xml to get all supported MIME types.
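
Option 1 above could look roughly like the following sketch. Everything here is an assumption made for illustration: the class name HumanContentType, the override entries, and the "unknown" fallback are not part of Solr Cell or Tika. The sketch derives the human type from the top-level MIME type and consults a small hardcoded map for types whose top-level part is not descriptive.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class HumanContentType {

    // Hypothetical overrides for MIME types whose top-level type says little.
    private static final Map<String, String> OVERRIDES = new HashMap<String, String>();
    static {
        OVERRIDES.put("application/pdf", "document");
        OVERRIDES.put("application/msword", "document");
    }

    /** Maps an extracted content-type such as "image/png" to a human type such as "image". */
    static String humanType(String mimeType) {
        if (mimeType == null) {
            return "unknown";
        }
        String normalized = mimeType.trim().toLowerCase(Locale.ROOT);
        int semi = normalized.indexOf(';'); // drop parameters such as "; charset=UTF-8"
        if (semi >= 0) {
            normalized = normalized.substring(0, semi).trim();
        }
        String override = OVERRIDES.get(normalized);
        if (override != null) {
            return override;
        }
        int slash = normalized.indexOf('/');
        return slash > 0 ? normalized.substring(0, slash) : "unknown";
    }

    public static void main(String[] args) {
        System.out.println(humanType("image/png"));                // image
        System.out.println(humanType("video/mp4; codecs=avc1"));   // video
        System.out.println(humanType("application/pdf"));          // document
    }
}
```

The returned value would then be indexed into a schema field (the description calls it media) so that queries like media:image work.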


[jira] Commented: (SOLR-1212) TestNG Test Case

2009-12-15 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12790648#action_12790648
 ] 

Shalin Shekhar Mangar commented on SOLR-1212:
-

I'm not sure what to do with this. We don't need to ship this with our 
releases. Perhaps it is best to mark this as Won't Fix and link this issue to 
http://wiki.apache.org/solr/TestingSolr so that people who use TestNG can use 
this code if necessary.

 TestNG Test Case 
 -

 Key: SOLR-1212
 URL: https://issues.apache.org/jira/browse/SOLR-1212
 Project: Solr
  Issue Type: New Feature
  Components: clients - java
Affects Versions: 1.4
 Environment: Java 6
Reporter: Kay Kay
 Fix For: 1.5

 Attachments: SOLR-1212.patch, testng-5.9-jdk15.jar

   Original Estimate: 1h
  Remaining Estimate: 1h

 TestNG equivalent of AbstractSolrTestCase, without using JUnit at all. 
 New class created: AbstractSolrNGTest. 
 LICENSE.txt and NOTICE.txt modified as appropriate (TestNG is under the Apache 
 License 2.0). 
 TestNG 5.9-jdk15 added to lib. 
 Justification: in some workplaces, people are moving towards TestNG and 
 taking JUnit out of the classpath altogether. Hence this is useful in those cases.


[jira] Resolved: (SOLR-630) Spellchecker should not be case-sensitive and should be stopwords-aware

2009-12-15 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-630.


Resolution: Invalid

I don't think this is a problem. As Alex noted, it is all a matter of 
configuring your analyzers and spell checker correctly.

 Spellchecker should not be case-sensitive and should be stopwords-aware
 ---

 Key: SOLR-630
 URL: https://issues.apache.org/jira/browse/SOLR-630
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Reporter: Otis Gospodnetic
Priority: Minor
 Fix For: 1.5


 Here are 2 more bugs:
 1)
 Search for:
   united states of America
 Suggests:
  united states oft America
 It looks like the SC doesn't check stopwords, and of is a stopword.  Thus, 
 it does not exist in the index,
 but oft does, so SC suggests oft and thinks of is misspelled.  I think 
 the SC component should check the list of
 stopwords, too, no?
 2)
 Search for:
  united states of America
 Suggests:
  united states oftAmericaa
 The of-oft is described above.  But note how SC suggested America-Americaa, 
 but it didn't do that for america.
 This looks like case-sensitivity problem.  Shouldn't the SC be 
 case-insensitive?
 I can't produce a patch now (no src handy), so I'm hoping Grant or somebody 
 else can do it based on this report.


[jira] Updated: (SOLR-1532) allow StreamingUpdateSolrServer to use a provided HttpClient

2009-12-15 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1532:


Attachment: SOLR-1532.patch

Synced to trunk. I'll commit this shortly.

 allow StreamingUpdateSolrServer to use a provided HttpClient
 

 Key: SOLR-1532
 URL: https://issues.apache.org/jira/browse/SOLR-1532
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 1.4
Reporter: gabriele renzi
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1532.patch, SOLR-1532.patch


 As of r830319 StreamingUpdateSolrServer does not allow calling code to 
 provide an HttpClient, which means client code cannot reuse an existing 
 connection manager. The patch adds a new constructor and refactors the old 
 one to use it.


[jira] Resolved: (SOLR-1532) allow StreamingUpdateSolrServer to use a provided HttpClient

2009-12-15 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1532.
-

Resolution: Fixed
  Assignee: Shalin Shekhar Mangar

Committed revision 890769.

Thanks Gabriele!

 allow StreamingUpdateSolrServer to use a provided HttpClient
 

 Key: SOLR-1532
 URL: https://issues.apache.org/jira/browse/SOLR-1532
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 1.4
Reporter: gabriele renzi
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1532.patch, SOLR-1532.patch


 As of r830319 StreamingUpdateSolrServer does not allow calling code to 
 provide an HttpClient, which means client code cannot reuse an existing 
 connection manager. The patch adds a new constructor and refactors the old 
 one to use it.


[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

2009-12-15 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1131:


Attachment: SOLR-1131.patch

I guess Noble was referring to something like what is done in this patch.

# DelegatingFieldType has a new method:
{code}
public SchemaField[] getSubFields(SchemaField mainField);
{code}
# PointType and PlusMinusField implement this new method. It is not the 
prettiest way but this is one way to do it.
# With this approach, we can get the names from the subFields wherever the name 
is used (not implemented in this patch).

The PlusMinusField is actually a field type and not a field so we should 
probably rename it to PlusMinusFieldType.

 Allow a single field type to index multiple fields
 --

 Key: SOLR-1131
 URL: https://issues.apache.org/jira/browse/SOLR-1131
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Reporter: Ryan McKinley
Assignee: Grant Ingersoll
 Fix For: 1.5

 Attachments: SOLR-1131-IndexMultipleFields.patch, 
 SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, 
 SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, 
 SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, 
 SOLR-1131.patch, SOLR-1131.patch


 In a few special cases, it makes sense for a single field (the concept) to 
 be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The 
 concept point may be best indexed in a variety of ways:
  * geohash (single lucene field)
  * lat field, lon field (two double fields)
  * cartesian tiers (a series of fields with tokens to say if it exists within 
 that region)


Re: NPE in MoreLikeThis referenced doc not found and debugQuery=True

2009-12-15 Thread Shalin Shekhar Mangar

On Thu, Dec 10, 2009 at 6:34 PM, david.stu...@progressivealliance.co.uk 
david.stu...@progressivealliance.co.uk wrote:

 Hi All,

 When I do a specific MLT search on a document with debugQuery=True I am
 getting a NullPointerException both on screen and in my catalina logs. The
 query is as follows:

 http://localhost:8080/solr2/select/?mlt.minwl=3&mlt.fl=body&mlt.mintf=1&mlt.maxwl=15&mlt.maxqt=20&version=1.2&rows=5&mlt.mindf=1&fl=nid,title,path,url,digest,teaser&start=0&q=nid:16036&qt=mlt&debugQuery=true

 Is this desired behavior?

 java.lang.RuntimeException: java.lang.NullPointerException
 at org.apache.solr.search.QueryParsing.toString(QueryParsing.java:470)
 at org.apache.solr.util.SolrPluginUtils.doStandardDebug(SolrPluginUtils.java:399)
 at org.apache.solr.handler.MoreLikeThisHandler.handleRequestBody(MoreLikeThisHandler.java:189)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
 at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
 at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
 at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
 at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
 at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
 at java.lang.Thread.run(Thread.java:637)
 Caused by: java.lang.NullPointerException
 at org.apache.solr.search.QueryParsing.toString(QueryParsing.java:439)
 at org.apache.solr.search.QueryParsing.toString(QueryParsing.java:467)
 ... 18 more


 Apologies if this has been discussed or deemed desired, but thought I would
 mention this and offer a patch as an entry into helping with the project.


Thanks for reporting this Dave. It'd be great if you can open a Jira issue
and attach a unit test reproducing this issue. A fix would be even better :)

http://wiki.apache.org/solr/HowToContribute

-- 
Regards,
Shalin Shekhar Mangar.

[jira] Commented: (SOLR-17) XSD for solr requests/responses

2009-12-15 Thread Shalin Shekhar Mangar (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12790910#action_12790910
]

Shalin Shekhar Mangar commented on SOLR-17:
---

Chris, it seems that you are taking my comment personally. Please don't; it is
not my intention to ridicule anyone's efforts.

As you can see, this issue has been open for some time now and a major reason
is that we have never found a good use for an XSD. I'm merely trying to say
that it seems like we're trying to _find_ use-cases for a solution instead of
starting with an actual need.

My point is that Solr can use it if we _want_ to but Solr certainly does not
_need_ to use it. I don't think we gain much by an XSD.

XSD for solr requests/responses
---

Key: SOLR-17
URL: https://issues.apache.org/jira/browse/SOLR-17
Project: Solr
Issue Type: Improvement
Reporter: Mike Baranczak
Priority: Minor
Attachments: solr-complex.xml, solr-rev2.xsd, solr.xsd,
UselessRequestHandler.java

Attaching an XML schema definition for the responses and the update requests.
I needed to do this for myself anyway, so I might as well contribute it to
the project.
At the moment, I have no plans to write an XSD for the config documents, but
it wouldn't be a bad idea.
TODO: change the schema URL. I'm guessing that Apache already has some sort
of naming convention for these?


Re: ValueSourceParser problem

2009-12-15 Thread Shalin Shekhar Mangar

On Wed, Dec 16, 2009 at 11:01 AM, patrick o'leary pj...@pjaol.com wrote:


 #2 There's an AbstractMethodError when you extend ValueSourceParser and
 don't override the init(NamedList args) method,
 because SolrCore:~439 createInitInstance casts the plugin class as a
 NamedListInitializedPlugin and calls
 ((NamedListInitializedPlugin) o).init(info.initArgs);

 If your extended ValueSourceParser class doesn't provide an override, then
 there's nothing that implements the base interface from
 NamedListInitializedPlugin.


ValueSourceParser in trunk has an empty init method so you should never get
an AbstractMethodError. Can you check again?

-- 
Regards,
Shalin Shekhar Mangar.

Re: ValueSourceParser problem

2009-12-15 Thread Shalin Shekhar Mangar

On Wed, Dec 16, 2009 at 11:32 AM, patrick o'leary pj...@pjaol.com wrote:

 Check SolrCore.createInitInstance
 It casts your CustomValueSourceParser as a NamedListInitializedPlugin,
 which is an interface, thus the AbstractMethodError, as there isn't a
 concrete implementation of init.

 If it cast it as a ValueSourceParser in SolrCore then it would be fine.


That is not possible. Even though the object is cast to an interface
NamedListInitializedPlugin, it is still an instance of ValueSourceParser and
therefore it does have an implementation of the init method. Am I missing
something?
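
The dispatch rule described above can be checked with a small self-contained sketch. The types here are simplified stand-ins, not the Solr classes: Initializable plays the role of NamedListInitializedPlugin, BaseParser the role of ValueSourceParser with a concrete init, and CustomParser the role of a subclass that does not override init. All names are hypothetical.

```java
class InterfaceCastDemo {

    /** Stand-in for the plugin interface (simplified, hypothetical). */
    interface Initializable {
        void init(String args);
    }

    /** Stand-in for an abstract base class that supplies a concrete init. */
    static abstract class BaseParser implements Initializable {
        boolean initCalled = false;

        // Concrete implementation; it only sets a flag so the call can be observed.
        public void init(String args) {
            initCalled = true;
        }

        abstract String parse(String input);
    }

    /** Subclass that overrides parse but not init. */
    static class CustomParser extends BaseParser {
        @Override
        String parse(String input) {
            return "parsed:" + input;
        }
    }

    static boolean initThroughInterface() {
        Object o = new CustomParser();
        // The cast changes only the static type; the call still dispatches
        // to the inherited BaseParser.init at run time.
        ((Initializable) o).init("args");
        return ((BaseParser) o).initCalled;
    }

    public static void main(String[] args) {
        System.out.println(initThroughInterface()); // prints true
    }
}
```

When everything is compiled and run against the same class versions, the call succeeds. An AbstractMethodError normally shows up only when a class definition at run time differs incompatibly from the one the code was compiled against, which points towards a classpath mismatch rather than the cast itself.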

-- 
Regards,
Shalin Shekhar Mangar.

Re: ValueSourceParser problem

2009-12-15 Thread Shalin Shekhar Mangar

On Wed, Dec 16, 2009 at 11:58 AM, patrick o'leary pj...@pjaol.com wrote:

 SEVERE: java.lang.AbstractMethodError
at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:439)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1498)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1492)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1525)
at org.apache.solr.core.SolrCore.initValueSourceParsers(SolrCore.java:1469)
at org.apache.solr.core.SolrCore.init(SolrCore.java:549)
at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)

 And
 svn info
 Path: .
 URL: http://svn.apache.org/repos/asf/lucene/solr/trunk
 Repository Root: http://svn.apache.org/repos/asf
 Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
 Revision: 891117
 Node Kind: directory
 Schedule: normal
 Last Changed Author: koji
 Last Changed Rev: 890798
 Last Changed Date: 2009-12-15 06:13:59 -0800 (Tue, 15 Dec 2009)


I just wrote a custom ValueSourceParser which does not override the init
method and it loads fine on current trunk. Can you share your code?

-- 
Regards,
Shalin Shekhar Mangar.

Re: ValueSourceParser problem

2009-12-15 Thread Shalin Shekhar Mangar

On Wed, Dec 16, 2009 at 12:39 PM, patrick o'leary pj...@pjaol.com wrote:

 Yeah.. can't release that part mate, all you need is

 package com.pjaol;

 import org.apache.lucene.queryParser.ParseException;
 import org.apache.solr.search.FunctionQParser;
 import org.apache.solr.search.ValueSourceParser;
 import org.apache.solr.search.function.ValueSource;

 public class CustomValueSourceParser extends ValueSourceParser {

     @Override
     public ValueSource parse(FunctionQParser fp) throws ParseException {
         // Printed at query time to confirm that the parser was invoked.
         System.out.println("*** Called");
         return null;
     }

 }


 And
 <valueSourceParser name="social_a"
 class="com.pjaol.CustomValueSourceParser"
 />
 in your solrconfig.xml

 The parse method only gets called at query time


Patrick, this works for me. The string is printed in the console. Your
runtime classpath must have Solr 1.3 jars somewhere, because
ValueSourceParser#init was abstract in 1.3:

http://svn.apache.org/repos/asf/lucene/solr/branches/branch-1.3/src/java/org/apache/solr/search/ValueSourceParser.java
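
One way to confirm where a class is really loaded from is to ask the JVM for its code source. The following is a minimal sketch using only standard JDK calls; the class name WhichJar and the fallback text are made up for this example, and the Solr class name is passed as an argument rather than compiled in.

```java
import java.security.CodeSource;

class WhichJar {

    /** Returns the jar or directory a class was loaded from, if the JVM knows it. */
    static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Typically the case for classes loaded by the bootstrap loader.
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults to this class so the example runs without extra jars.
        String name = args.length > 0 ? args[0] : WhichJar.class.getName();
        System.out.println(name + " loaded from " + locationOf(Class.forName(name)));
    }
}
```

Run it with the same classpath as the failing application and pass org.apache.solr.search.ValueSourceParser as the argument; a location pointing at a 1.3 jar would explain the AbstractMethodError.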

-- 
Regards,
Shalin Shekhar Mangar.

[jira] Resolved: (SOLR-1651) Incorrect dataimport handler package name in SolrResourceLoader

2009-12-14 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1651.
-

Resolution: Fixed

Committed revision 890243.

Thanks for the catch Akshay!

 Incorrect dataimport handler package name in SolrResourceLoader
 ---

 Key: SOLR-1651
 URL: https://issues.apache.org/jira/browse/SOLR-1651
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 1.4
Reporter: Akshay K. Ukey
Assignee: Shalin Shekhar Mangar
Priority: Trivial
 Fix For: 1.5

 Attachments: SOLR-1651.patch


 The packages String array used by the findClass method in SolrResourceLoader has 
 the value for the dataimport handler package as "handler.dataimport"; it must be 
 "handler.dataimport." (with a trailing dot).


[jira] Resolved: (SOLR-1610) Add generics to SolrCache

2009-12-14 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1610.
-

Resolution: Fixed

Committed revision 890250.

Thanks Jason!

 Add generics to SolrCache
 -

 Key: SOLR-1610
 URL: https://issues.apache.org/jira/browse/SOLR-1610
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Jason Rutherglen
Assignee: Shalin Shekhar Mangar
Priority: Trivial
 Fix For: 1.5

 Attachments: SOLR-1610.patch


 Seems fairly simple for SolrCache to have generics.


[jira] Commented: (SOLR-1653) add PatternReplaceCharFilter

2009-12-14 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12790577#action_12790577
 ] 

Shalin Shekhar Mangar commented on SOLR-1653:
-

bq. If there is no objections, I'll commit later today.

+1

Thanks Koji!

 add PatternReplaceCharFilter
 

 Key: SOLR-1653
 URL: https://issues.apache.org/jira/browse/SOLR-1653
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: Koji Sekiguchi
Assignee: Koji Sekiguchi
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1653.patch, SOLR-1653.patch


 Add a new CharFilter that uses a regular expression for the target of replace 
 string in char stream.
 Usage:
 {code:title=schema.xml}
 <fieldType name="textCharNorm" class="solr.TextField" 
 positionIncrementGap="100" >
   <analyzer>
     <charFilter class="solr.PatternReplaceCharFilterFactory"
                 groupedPattern="([nN][oO]\.)\s*(\d+)"
                 replaceGroups="1,2" blockDelimiters=":;"/>
     <charFilter class="solr.MappingCharFilterFactory" 
                 mapping="mapping-ISOLatin1Accent.txt"/>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
   </analyzer>
 </fieldType>
 {code}
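
To see what the groupedPattern in the usage above matches, the regular expression can be tried with plain java.util.regex. This sketch does not use or reproduce PatternReplaceCharFilter; reading replaceGroups="1,2" as "keep groups 1 and 2" is an assumption, and blockDelimiters is not modelled at all. The class name GroupedPatternDemo is hypothetical.

```java
import java.util.regex.Pattern;

class GroupedPatternDemo {

    // Same expression as groupedPattern, with backslashes escaped for Java.
    static final Pattern PATTERN = Pattern.compile("([nN][oO]\\.)\\s*(\\d+)");

    /** Keeps group 1 ("No.") and group 2 (the digits), dropping the whitespace between them. */
    static String joinGroups(String input) {
        return PATTERN.matcher(input).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        System.out.println(joinGroups("see No.  12 and no. 7")); // see No.12 and no.7
    }
}
```

Group 1 captures "No." in any letter case and group 2 captures the number, so the only text outside the groups is the optional whitespace between them.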


[jira] Resolved: (SOLR-1643) remove DIH-extras package

2009-12-14 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1643.
-

Resolution: Won't Fix

Reverted previous commit and moved TikaEntityProcessor and tests to extras.

Committed revision 890679.

 remove DIH-extras package
 -

 Key: SOLR-1643
 URL: https://issues.apache.org/jira/browse/SOLR-1643
 Project: Solr
  Issue Type: Sub-task
  Components: contrib - DataImportHandler
Reporter: Noble Paul
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: SOLR-1643.patch, SOLR-1643.patch


 Now that jars can be added directly using solrconfig.xml, we may not really 
 need this extra package. We can compile and add this to the main 
 dataimporthandler.jar and specify in the instructions how to include the jars 
 for those components with external requirements, such as 
 MailEntityProcessor/TikaEntityProcessor.


[jira] Updated: (SOLR-1139) SolrJ TermsComponent Query and Response Support

2009-12-13 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1139:


Attachment: SOLR-1139.patch

Updated patch for two params added by SOLR-1625.

I'll commit this shortly.

 SolrJ TermsComponent Query and Response Support
 ---

 Key: SOLR-1139
 URL: https://issues.apache.org/jira/browse/SOLR-1139
 Project: Solr
  Issue Type: New Feature
  Components: clients - java
Affects Versions: 1.4
Reporter: Matt Weber
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Attachments: SOLR-1139-WITH_SORT_SUPPORT.patch, SOLR-1139.patch, 
 SOLR-1139.patch, SOLR-1139.patch, SOLR-1139.patch, SOLR-1139.patch, 
 SOLR-1139.patch, SOLR-1139.patch


 SolrJ should support the new TermsComponent that was introduced in Solr 1.4.  
 It should be able to:
 - set TermsComponent query parameters via SolrQuery
 - parse the TermsComponent response


[jira] Resolved: (SOLR-1139) SolrJ TermsComponent Query and Response Support

2009-12-13 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1139.
-

   Resolution: Fixed
Fix Version/s: 1.5

Committed revision 890053.

Thanks Matt!

 SolrJ TermsComponent Query and Response Support
 ---

 Key: SOLR-1139
 URL: https://issues.apache.org/jira/browse/SOLR-1139
 Project: Solr
  Issue Type: New Feature
  Components: clients - java
Affects Versions: 1.4
Reporter: Matt Weber
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1139-WITH_SORT_SUPPORT.patch, SOLR-1139.patch, 
 SOLR-1139.patch, SOLR-1139.patch, SOLR-1139.patch, SOLR-1139.patch, 
 SOLR-1139.patch, SOLR-1139.patch


 SolrJ should support the new TermsComponent that was introduced in Solr 1.4.  
 It should be able to:
 - set TermsComponent query parameters via SolrQuery
 - parse the TermsComponent response


[jira] Updated: (SOLR-1177) Distributed TermsComponent

2009-12-13 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-1177:


Attachment: SOLR-1177.patch

{code}
if (tc.getFrequency() >= freqmin && tc.getFrequency() <= freqmax) {
  fieldterms.add(tc.getTerm(), ((Number)tc.getFrequency()).intValue());
  cnt++;
}
{code}

I changed freqmin and freqmax to long and used Yonik's method to write int if 
possible or else switch to longs in the response.
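
The "int if possible, else long" idea can be sketched as below. This is not the method from the patch; the class and method names (IntOrLong, num) are assumptions chosen for illustration. It returns an Integer when the value fits in 32 bits and a Long otherwise, so small counts keep their compact form in the response.

```java
class IntOrLong {

    /** Returns an Integer when val fits in an int, otherwise a Long. */
    static Number num(long val) {
        if (val >= Integer.MIN_VALUE && val <= Integer.MAX_VALUE) {
            return Integer.valueOf((int) val);
        }
        return Long.valueOf(val);
    }

    public static void main(String[] args) {
        System.out.println(num(42L).getClass().getSimpleName());         // Integer
        System.out.println(num(3000000000L).getClass().getSimpleName()); // Long
    }
}
```

Clients that already expect ints for term counts keep working, and only counts beyond the int range change type.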

I'll commit this shortly.

 Distributed TermsComponent
 --

 Key: SOLR-1177
 URL: https://issues.apache.org/jira/browse/SOLR-1177
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.4
Reporter: Matt Weber
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1177.patch, SOLR-1177.patch, SOLR-1177.patch, 
 SOLR-1177.patch, SOLR-1177.patch, SOLR-1177.patch, TermsComponent.java, 
 TermsComponent.patch


 TermsComponent should be distributed


[jira] Resolved: (SOLR-1177) Distributed TermsComponent

2009-12-13 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-1177.
-

Resolution: Fixed

Committed revision 890199.

Thanks Matt!

 Distributed TermsComponent
 --

 Key: SOLR-1177
 URL: https://issues.apache.org/jira/browse/SOLR-1177
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.4
Reporter: Matt Weber
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1177.patch, SOLR-1177.patch, SOLR-1177.patch, 
 SOLR-1177.patch, SOLR-1177.patch, SOLR-1177.patch, TermsComponent.java, 
 TermsComponent.patch


 TermsComponent should be distributed


[jira] Commented: (SOLR-1653) add PatternReplaceCharFilter

2009-12-13 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12790026#action_12790026
 ] 

Shalin Shekhar Mangar commented on SOLR-1653:
-

Koji, even after reading through the test, I do not understand how to use it. 
Are the characters in curly braces written down for non-groups only? What if I 
want to remove one particular group?

It is always good to write a use-case and an example in the issue description 
itself.

 add PatternReplaceCharFilter
 

 Key: SOLR-1653
 URL: https://issues.apache.org/jira/browse/SOLR-1653
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: Koji Sekiguchi
Assignee: Koji Sekiguchi
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1653.patch


 Add a new CharFilter that uses a regular expression to select the text to 
 replace in the char stream.
 Usage:
 {code:title=schema.xml}
 <fieldType name="textCharNorm" class="solr.TextField"
            positionIncrementGap="100">
   <analyzer>
     <charFilter class="solr.PatternReplaceCharFilterFactory"
                 groupedPattern="([nN][oO]\.)\s*(\d+)"
                 replaceGroups="1,2" blockDelimiters=":;"/>
     <charFilter class="solr.MappingCharFilterFactory"
                 mapping="mapping-ISOLatin1Accent.txt"/>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
   </analyzer>
 </fieldType>
 {code}
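
To make the groupedPattern value in the usage snippet easier to read, here is a 
small sketch using plain java.util.regex. It only shows what the two capture 
groups of that pattern match and what is left when a match is rebuilt from 
groups 1 and 2. Whether PatternReplaceCharFilter's replaceGroups option behaves 
the same way is an assumption; the class name GroupedPatternSketch is made up.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; this is not the PatternReplaceCharFilter code.
class GroupedPatternSketch {

    // Same regular expression as the groupedPattern attribute above.
    static final Pattern PATTERN = Pattern.compile("([nN][oO]\\.)\\s*(\\d+)");

    // Rebuilds every match from groups 1 and 2 only, which drops whatever
    // the ungrouped \s* matched between them.
    static String keepGroups(String input) {
        return PATTERN.matcher(input).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        System.out.println(keepGroups("see No.   42 and no. 7"));
    }
}
```

Group 1 captures the "No." prefix in any letter case and group 2 captures the 
digits; the whitespace between them belongs to no group.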


[jira] Commented: (SOLR-1177) Distributed TermsComponent

2009-12-12 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789790#action_12789790
 ] 

Shalin Shekhar Mangar commented on SOLR-1177:
-

Thanks Matt. Can you please attach the relevant portions to SOLR-1139? We can 
commit SOLR-1139 first and then resolve this one.

 Distributed TermsComponent
 --

 Key: SOLR-1177
 URL: https://issues.apache.org/jira/browse/SOLR-1177
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.4
Reporter: Matt Weber
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1177.patch, SOLR-1177.patch, SOLR-1177.patch, 
 SOLR-1177.patch, TermsComponent.java, TermsComponent.patch


 TermsComponent should be distributed


[jira] Assigned: (SOLR-1651) Incorrect dataimport handler package name in SolrResourceLoader

2009-12-12 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-1651:
---

Assignee: Shalin Shekhar Mangar

 Incorrect dataimport handler package name in SolrResourceLoader
 ---

 Key: SOLR-1651
 URL: https://issues.apache.org/jira/browse/SOLR-1651
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 1.4
Reporter: Akshay K. Ukey
Assignee: Shalin Shekhar Mangar
Priority: Trivial
 Fix For: 1.5

 Attachments: SOLR-1651.patch


 The packages String array used by the findClass method in SolrResourceLoader 
 has the value "handler.dataimport" for the dataimport handler package; it must 
 be "handler.dataimport." (with a trailing dot).
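
Why a single trailing dot matters can be sketched as follows, assuming the 
lookup builds candidate class names by plain string concatenation of a base 
package, a sub-package entry and the short class name. That concatenation 
scheme and the names SubPackageLookupSketch and candidate are assumptions made 
for illustration, not a copy of SolrResourceLoader.

```java
// Hypothetical illustration of the bug class; not the SolrResourceLoader code.
class SubPackageLookupSketch {

    static final String BASE = "org.apache.solr";

    // Builds the fully qualified name that would be tried for one entry.
    static String candidate(String subPackage, String simpleName) {
        return BASE + "." + subPackage + simpleName;
    }

    public static void main(String[] args) {
        // Without the trailing dot the package and class name run together.
        System.out.println(candidate("handler.dataimport", "DataImportHandler"));
        // With the trailing dot the name is a valid qualified class name.
        System.out.println(candidate("handler.dataimport.", "DataImportHandler"));
    }
}
```

Under that assumption the entry without the dot can never resolve, because the 
resulting name does not correspond to any class.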


[jira] Commented: (SOLR-1177) Distributed TermsComponent

2009-12-12 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789795#action_12789795
 ] 

Shalin Shekhar Mangar commented on SOLR-1177:
-

bq. The latest SOLR-1139 patch is included inside the latest patch I attached 
to this ticket. Should I separate them? 

Yes. I'll commit SOLR-1139 first, so remove those classes from the current patch.

PS: I'm sorry if I am confusing you. It is 3AM here and I'm a little confused 
myself :)

 Distributed TermsComponent
 --

 Key: SOLR-1177
 URL: https://issues.apache.org/jira/browse/SOLR-1177
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.4
Reporter: Matt Weber
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1177.patch, SOLR-1177.patch, SOLR-1177.patch, 
 SOLR-1177.patch, TermsComponent.java, TermsComponent.patch


 TermsComponent should be distributed


[jira] Commented: (SOLR-1652) Allow single unit test to be executed from SOLR build.xml

2009-12-12 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789803#action_12789803
 ] 

Shalin Shekhar Mangar commented on SOLR-1652:
-

This capability already exists.

Run a single test using:
ant -Dtestcase=TestDistributedSearch clean test

Run tests inside a package (recursively):
ant -Dtestpackage=org.apache.solr.handler clean test

Run tests in package root:
ant -Dtestpackageroot=org.apache.solr.handler clean test

The above will exclude packages inside handler such as admin and component.
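
The difference between the recursive and the root-only selection can be 
sketched like this. It is only a model of the behaviour described above, not 
how build.xml implements it, and the test class names in main are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the two selection modes; not taken from build.xml.
class TestSelectionSketch {

    // Package part of a fully qualified class name ("" for the default package).
    static String packageOf(String className) {
        int dot = className.lastIndexOf('.');
        return dot < 0 ? "" : className.substring(0, dot);
    }

    // recursive=true models -Dtestpackage, recursive=false models -Dtestpackageroot.
    static List<String> select(List<String> classNames, String pkg, boolean recursive) {
        List<String> selected = new ArrayList<>();
        for (String name : classNames) {
            String p = packageOf(name);
            boolean inRoot = p.equals(pkg);
            boolean inSubPackage = p.startsWith(pkg + ".");
            if (inRoot || (recursive && inSubPackage)) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Made-up class names, used only to show the two modes.
        List<String> tests = Arrays.asList(
                "org.apache.solr.handler.ExampleHandlerTest",
                "org.apache.solr.handler.admin.ExampleAdminTest",
                "org.apache.solr.search.ExampleSearchTest");
        System.out.println(select(tests, "org.apache.solr.handler", true));
        System.out.println(select(tests, "org.apache.solr.handler", false));
    }
}
```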

 Allow single unit test to be executed from SOLR build.xml
 -

 Key: SOLR-1652
 URL: https://issues.apache.org/jira/browse/SOLR-1652
 Project: Solr
  Issue Type: New Feature
  Components: Build
Affects Versions: 1.2, 1.3, 1.4
 Environment: My local MacBook
Reporter: Chris A. Mattmann
 Fix For: 1.5


 While playing around and running someone's example code in the form of a 
 test, I realized it might be nice to run a single test from the ant command 
 line when testing SOLR. To my knowledge, there is no way to do this. So, I 
 googled around and found a nice way of doing it. I'll contribute a patch that 
 allows you to do:
 ant runtest -Dtest=<fully qualified class name or just class name no package> 
 [-Dargs=<jvm args for junit>]
 which will run one of SOLR's unit tests at a time. You can also use *'s in 
 the -Dtest= to run many test cases that match the * expression too.

