[jira] Updated: (SOLR-658) Allow Solr to load index from arbitrary directory in dataDir and Commit point
[ https://issues.apache.org/jira/browse/SOLR-658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-658:
  Attachment: SOLR-658.patch

Updated with a bug fix:
{code}
if (result != null && result.trim().length() > 0) {
  File tmp = new File(dataDir + s);
  if (tmp.exists() && tmp.isDirectory())
    result = dataDir + s;
}
{code}
should be:
{code}
if (s != null && s.trim().length() > 0) {
  File tmp = new File(dataDir + s);
  if (tmp.exists() && tmp.isDirectory())
    result = dataDir + s;
}
{code}
I'll commit shortly.

Allow Solr to load index from arbitrary directory in dataDir and Commit point
  Key: SOLR-658
  URL: https://issues.apache.org/jira/browse/SOLR-658
  Project: Solr
  Issue Type: Improvement
  Affects Versions: 1.4
  Reporter: Noble Paul
  Assignee: Shalin Shekhar Mangar
  Fix For: 1.4
  Attachments: SOLR-658.patch, SOLR-658.patch, SOLR-658.patch, SOLR-658.patch, SOLR-658.patch

This is a requirement for Java-based Solr replication.

Use case for an arbitrary index directory: if the slave has a corrupted index and the filesystem does not allow overwriting files in use (NTFS), replication will fail. The solution is to copy the index from the master to an alternate directory on the slave and load the IndexReader/IndexWriter from this alternate directory.

Use case for an arbitrary commit point: replication can also provide a rollback feature. The rollback should be able to specify a commit point/generation so that rollback is possible.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
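The corrected check in the comment above can be tried in isolation. What follows is only a minimal sketch, not the code from the patch: the class name `IndexDirResolver`, the method name `resolveIndexDir`, and the fallback of `dataDir + "index/"` are assumptions made for illustration. It shows why the guard has to test the candidate name `s` rather than `result`.

```java
import java.io.File;

// Hypothetical helper: the names and the default directory are assumptions,
// not taken from SOLR-658.patch.
final class IndexDirResolver {

    // Returns dataDir + s when s names an existing directory under dataDir,
    // otherwise falls back to the assumed default location.
    static String resolveIndexDir(String dataDir, String s) {
        String result = dataDir + "index/"; // assumed default
        // Guard on the candidate name `s`, not on `result`.
        if (s != null && s.trim().length() > 0) {
            File tmp = new File(dataDir + s);
            if (tmp.exists() && tmp.isDirectory()) {
                result = dataDir + s;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Build a scratch data directory containing one alternate index directory.
        File data = new File(System.getProperty("java.io.tmpdir"), "solr658-" + System.nanoTime());
        new File(data, "index.alt").mkdirs();
        String dataDir = data.getPath() + File.separator;

        System.out.println(resolveIndexDir(dataDir, "index.alt")); // existing directory
        System.out.println(resolveIndexDir(dataDir, "missing"));   // falls back to default
        System.out.println(resolveIndexDir(dataDir, null));        // falls back to default
    }
}
```

With the original guard on `result`, a null `s` would still reach `new File(dataDir + s)`, since `result` is never null at that point in this sketch.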
[jira] Updated: (SOLR-674) Add WriterParams interface for writer parameter strings
[ https://issues.apache.org/jira/browse/SOLR-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Kotthoff updated SOLR-674:
  Attachment: SOLR-674.patch

Updating patch to trunk.

Add WriterParams interface for writer parameter strings
  Key: SOLR-674
  URL: https://issues.apache.org/jira/browse/SOLR-674
  Project: Solr
  Issue Type: Task
  Affects Versions: 1.3
  Reporter: Lars Kotthoff
  Priority: Minor
  Attachments: SOLR-647.patch, SOLR-674.patch

Currently the request parameters specific to writers are extracted using hardcoded strings. The strings should be pulled out to an interface.
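The idea of pulling hardcoded parameter strings into an interface might look roughly like the sketch below. The interface name `WriterParamsSketch`, its constant names, and the helper `jsonNamedListStyle` are hypothetical and not taken from the attached patch; the parameter strings `indent` and `json.nl` and the `flat` fallback are used here only as plausible examples.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical constants interface; names are illustrative, not from SOLR-674.patch.
interface WriterParamsSketch {
    String INDENT = "indent";
    String JSON_NL_STYLE = "json.nl";
}

final class WriterParamsDemo {

    // Looks the parameter up through the shared constant instead of a string literal.
    static String jsonNamedListStyle(Map<String, String> params) {
        String style = params.get(WriterParamsSketch.JSON_NL_STYLE);
        return style != null ? style : "flat"; // assumed fallback value
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        System.out.println(jsonNamedListStyle(params)); // no value set, fallback is used
        params.put(WriterParamsSketch.JSON_NL_STYLE, "map");
        System.out.println(jsonNamedListStyle(params)); // explicit value is returned
    }
}
```

Keeping the strings in one place means a misspelled parameter name becomes a compile error at the call site rather than a silently ignored request parameter.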
[jira] Updated: (SOLR-686) single lock factory overwrites previous
[ https://issues.apache.org/jira/browse/SOLR-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Kotthoff updated SOLR-686:
  Attachment: SOLR-686.patch

Syncing patch with trunk -- should I perhaps file this as a separate issue?

single lock factory overwrites previous
  Key: SOLR-686
  URL: https://issues.apache.org/jira/browse/SOLR-686
  Project: Solr
  Issue Type: Bug
  Components: update
  Reporter: Yonik Seeley
  Assignee: Yonik Seeley
  Fix For: 1.3
  Attachments: SOLR-686.patch, SOLR-686.patch, SOLR-686.patch

On a core reload, the Directory is retrieved and a new single lock factory is set, effectively removing all previous locks.
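The failure mode described in this issue can be illustrated with a toy model. The classes below are stand-ins written for this sketch only; they are not Lucene's Directory or lock factory classes, and the assumption that an in-memory factory tracks held locks in a set is a simplification.

```java
import java.util.HashSet;
import java.util.Set;

// Toy stand-in for an in-memory lock factory; NOT a Lucene class.
final class ToyLockFactory {
    private final Set<String> held = new HashSet<>();

    // Returns false when the named lock is already held.
    boolean obtain(String name) {
        return held.add(name);
    }
}

// Toy stand-in for a directory that owns a lock factory; NOT a Lucene class.
final class ToyDirectory {
    private ToyLockFactory lockFactory = new ToyLockFactory();

    ToyLockFactory getLockFactory() {
        return lockFactory;
    }

    void setLockFactory(ToyLockFactory factory) {
        lockFactory = factory;
    }
}

final class LockOverwriteDemo {

    // Returns true when a second writer obtains the lock although the first still holds it.
    static boolean secondWriterGetsLock(boolean replaceFactoryOnReload) {
        ToyDirectory dir = new ToyDirectory();
        dir.getLockFactory().obtain("write.lock"); // first writer takes the lock
        if (replaceFactoryOnReload) {
            dir.setLockFactory(new ToyLockFactory()); // simulated reload installs a fresh factory
        }
        return dir.getLockFactory().obtain("write.lock");
    }

    public static void main(String[] args) {
        System.out.println(secondWriterGetsLock(true));  // factory replaced on reload
        System.out.println(secondWriterGetsLock(false)); // factory kept across reload
    }
}
```

In this model, replacing the factory discards the record of the held lock, which is the effect the issue describes; keeping the existing factory preserves it.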
[jira] Commented: (SOLR-769) Support Document and Search Result clustering
[ https://issues.apache.org/jira/browse/SOLR-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638791#action_12638791 ]

Grant Ingersoll commented on SOLR-769:

Patch soon, as a start. I'm going to check in the basic directory structure and libs, and then provide a patch with the source that we can iterate on.

Support Document and Search Result clustering
  Key: SOLR-769
  URL: https://issues.apache.org/jira/browse/SOLR-769
  Project: Solr
  Issue Type: New Feature
  Reporter: Grant Ingersoll
  Assignee: Grant Ingersoll
  Priority: Minor

Clustering is a useful tool for working with documents and search results, similar to the notion of dynamic faceting. Carrot2 (http://project.carrot2.org/) is a nice, BSD-licensed library for doing search results clustering. Mahout (http://lucene.apache.org/mahout) is well suited for whole-corpus clustering.

The patch lays out a contrib module that starts off w/ an integration of a SearchComponent for doing clustering and an implementation using Carrot. In search results mode, it will use the DocList as the input for the cluster. While Carrot2 comes w/ a Solr input component, it is not the same as the SearchComponent that I have, in that the Carrot example actually submits a query to Solr, whereas my SearchComponent is just chained into the Component list and uses the ResponseBuilder to add in the cluster results.

While not fully fleshed out yet, the collection-based mode will take in a list of ids or just use the whole collection and will produce clusters. Since this is a longer, typically offline task, there will need to be some type of storage mechanism (and replication??) for the clusters. I _may_ push this off to a separate JIRA issue, but I at least want to present the use case as part of the design of this component/contrib.

It may even make sense that we split this out, such that the building piece is something like an UpdateProcessor and then the SearchComponent just acts as a lookup mechanism.
[jira] Work started: (SOLR-769) Support Document and Search Result clustering
[ https://issues.apache.org/jira/browse/SOLR-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on SOLR-769 started by Grant Ingersoll.

Support Document and Search Result clustering
  Key: SOLR-769
  URL: https://issues.apache.org/jira/browse/SOLR-769
  Project: Solr
  Issue Type: New Feature
  Reporter: Grant Ingersoll
  Assignee: Grant Ingersoll
  Priority: Minor
[jira] Updated: (SOLR-769) Support Document and Search Result clustering
[ https://issues.apache.org/jira/browse/SOLR-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-769:
  Attachment: clustering-libs.tar

Clustering libs

Support Document and Search Result clustering
  Key: SOLR-769
  URL: https://issues.apache.org/jira/browse/SOLR-769
  Project: Solr
  Issue Type: New Feature
  Reporter: Grant Ingersoll
  Assignee: Grant Ingersoll
  Priority: Minor
  Attachments: clustering-libs.tar
[jira] Updated: (SOLR-769) Support Document and Search Result clustering
[ https://issues.apache.org/jira/browse/SOLR-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-769:
  Attachment: SOLR-769.patch

First draft of a patch.

Notes:
1. Carrot2 uses the snowball stemmers, but it shouldn't clash, b/c it actually slightly changes their names to be like englishStemmer (as opposed to EnglishStemmer). I'm debating whether or not to just re-implement this so that it can use the same snowball stemmers we use in Solr. Probably not a big deal.
2. I haven't implemented document clustering yet. To do this, I need to set up a background thread that will be spawned to do the clustering, since it is presumably going through some large set of documents and clustering them. It will probably require term vectors. This will introduce a dep. on Mahout, so I'll need a version of that library too.
3. It would be really cool for the Carrot2 implementation to support using other clustering algs besides Lingo. Basically, this just needs to be factored into the configuration and the jars included in the distribution. This is not a high priority for me at the moment.

TODO:
- More tests.
- Decide on output format.
- Implement doc. clustering framework part (i.e. spawning of threads, commands).

Support Document and Search Result clustering
  Key: SOLR-769
  URL: https://issues.apache.org/jira/browse/SOLR-769
  Project: Solr
  Issue Type: New Feature
  Reporter: Grant Ingersoll
  Assignee: Grant Ingersoll
  Priority: Minor
  Attachments: clustering-libs.tar, SOLR-769.patch
[jira] Updated: (SOLR-84) Logo Contests
[ https://issues.apache.org/jira/browse/SOLR-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lukas Vlcek updated SOLR-84:
  Attachment: apache_solr_burning.png

Logo Contests
  Key: SOLR-84
  URL: https://issues.apache.org/jira/browse/SOLR-84
  Project: Solr
  Issue Type: Improvement
  Reporter: Bertrand Delacretaz
  Priority: Minor
  Attachments: apache_solr_burning.png, logo-grid.jpg, logo-solr-d.jpg, logo-solr-e.jpg, logo-solr-source-files-take2.zip, solr-84-source-files.zip, solr-f.jpg, solr-logo-20061214.jpg, solr-logo-20061218.JPG, solr-logo-20070124.JPG, solr-nick.gif, solr.jpg, solr.s1.jpg, solr.svg, solr_logo_it_is_burning.png, sslogo-solr-flare.jpg, sslogo-solr.jpg, sslogo-solr2-flare.jpg, sslogo-solr2.jpg, sslogo-solr3.jpg

This issue was originally a scratch pad for various ideas for new logos. It is now being used as a repository for submissions for the Solr Logo Contest...
http://wiki.apache.org/solr/LogoContest

Note that many of the images currently attached are not eligible for the contest since they do not meet the official guidelines for new Apache project logos (in particular, that the full project name "Apache Solr" must be included in the logo). Only eligible attachments will be included in the official voting.
[jira] Updated: (SOLR-84) Logo Contests
[ https://issues.apache.org/jira/browse/SOLR-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lukas Vlcek updated SOLR-84:
  Attachment: (was: solr_logo_it_is_burning.png)

Logo Contests
  Key: SOLR-84
  URL: https://issues.apache.org/jira/browse/SOLR-84
  Project: Solr
  Issue Type: Improvement
  Reporter: Bertrand Delacretaz
  Priority: Minor
  Attachments: apache_solr_burning.png, logo-grid.jpg, logo-solr-d.jpg, logo-solr-e.jpg, logo-solr-source-files-take2.zip, solr-84-source-files.zip, solr-f.jpg, solr-logo-20061214.jpg, solr-logo-20061218.JPG, solr-logo-20070124.JPG, solr-nick.gif, solr.jpg, solr.s1.jpg, solr.svg, sslogo-solr-flare.jpg, sslogo-solr.jpg, sslogo-solr2-flare.jpg, sslogo-solr2.jpg, sslogo-solr3.jpg
Re: [jira] Updated: (SOLR-84) New Solr logo?
Andrzej, your ascii looks great ;-)

However, I tried something different, see:
https://issues.apache.org/jira/secure/attachment/12391946/apache_solr_burning.png

If I have a chance, I will also try to create a version with sun beams (based on my proposal #1 http://picasaweb.google.cz/lukas.vlcek/Solr#5235620801281945858 ) instead of the flame.

Regards,
Lukas

On Wed, Oct 8, 2008 at 7:07 PM, Andrzej Bialecki [EMAIL PROTECTED] wrote:

Lukáš Vlček wrote:

Hi, I am glad you like the draft#1 (and actually I think the second design is not totally lost, just wipe out the Apache letters and you get it). But the problem is that the draft#1 (as it is today) would not make it into the contest due to violation of the strongest requirement: The logo must incorporate the full project name: Apache Solr. That is the assignment (http://wiki.apache.org/solr/LogoContest). You can try to push the contest organizers, not me...

How about a layout like this one (hopefully the ascii art makes it through email ...):

,--. A p a c h e |__ \|/ + ,-+ | - O - | |_/ --' /,|.\ +-- | \

--
Best regards,
Andrzej Bialecki
 ___. ___ ___ ___ _ _ __
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com

--
http://blog.lukas-vlcek.com/
[jira] Commented: (SOLR-577) added support for boosting fields and documents to python solr interface
[ https://issues.apache.org/jira/browse/SOLR-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638797#action_12638797 ]

Dorneles Tremea commented on SOLR-577:

This issue can now be closed, as it's being tracked by:
http://code.google.com/p/solrpy/issues/detail?id=6

added support for boosting fields and documents to python solr interface
  Key: SOLR-577
  URL: https://issues.apache.org/jira/browse/SOLR-577
  Project: Solr
  Issue Type: Improvement
  Components: clients - python
  Environment: linux, python
  Reporter: Rob Young
  Attachments: solr.py

Added the ability to set boosts on fields and documents when indexing. This is done through two new classes, solr.Document and solr.Field:

    import solr

    c = solr.SolrConnection(host='localhost:8081')
    # Boost a single field on a document added by keyword arguments.
    c.add(id='123', name=solr.Field('this is a field', boost=1.5))
    # Boost a whole document, and a field within it.
    doc = solr.Document(boost=1.5)
    doc.add(solr.Field(name='title', value='a value for my field', boost=1.1))
    c.addDoc(doc)
[jira] Commented: (SOLR-216) Improvements to solr.py
[ https://issues.apache.org/jira/browse/SOLR-216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638803#action_12638803 ]

Dorneles Tremea commented on SOLR-216:

Looks like the above issues are now being addressed at http://code.google.com/p/solrpy/issues and so this ticket can be closed...

Improvements to solr.py
  Key: SOLR-216
  URL: https://issues.apache.org/jira/browse/SOLR-216
  Project: Solr
  Issue Type: Improvement
  Components: clients - python
  Affects Versions: 1.2
  Reporter: Jason Cater
  Assignee: Mike Klaas
  Priority: Trivial
  Attachments: solr-solrpy-r5.patch, solr.py, solr.py, solr.py, solr.py, test_all.py

I've taken the original solr.py code and extended it to include higher-level functions.
* Requires Python 2.3+
* Supports the SSL (https://) scheme
* Conforms (mostly) to PEP 8 -- the Python Style Guide
* Provides a high-level results object with implicit data type conversion
* Supports batching of update commands
[jira] Updated: (SOLR-769) Support Document and Search Result clustering
[ https://issues.apache.org/jira/browse/SOLR-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-769:
  Attachment: SOLR-769.patch

More updates, added example

Support Document and Search Result clustering
  Key: SOLR-769
  URL: https://issues.apache.org/jira/browse/SOLR-769
  Project: Solr
  Issue Type: New Feature
  Reporter: Grant Ingersoll
  Assignee: Grant Ingersoll
  Priority: Minor
  Attachments: clustering-libs.tar, SOLR-769.patch, SOLR-769.patch
[jira] Commented: (SOLR-769) Support Document and Search Result clustering
[ https://issues.apache.org/jira/browse/SOLR-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638814#action_12638814 ]

Grant Ingersoll commented on SOLR-769:

Still to do: more testing, get feedback, implement basics of doc. clustering. This last piece will take some more design work. Also need to validate some more that the results make sense for search results clustering, but my first look suggests they do.

Support Document and Search Result clustering
  Key: SOLR-769
  URL: https://issues.apache.org/jira/browse/SOLR-769
  Project: Solr
  Issue Type: New Feature
  Reporter: Grant Ingersoll
  Assignee: Grant Ingersoll
  Priority: Minor
  Attachments: clustering-libs.tar, SOLR-769.patch, SOLR-769.patch