[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746395#action_12746395 ]

Uri Boness commented on SOLR-1163:
----------------------------------

bq. Does a GWT client application have a clean license?
If having a pure Apache 2 license is considered to be clean, then yes.

bq. Are there any other GWT apps in the Apache project?
Not as far as I know. But you do have [LucidGaze|http://www.lucidimagination.com/Downloads/Certified-Distributions#lucidgaze], which is a Solr monitoring tool, and I think it's also a GWT application.

bq. +1. This is great.
Thanks, you can also vote for it ;-)

bq. The Simile project has some nice data explorer UIs. The Simile-Widget gallery displays them.
Thanks for the suggestion. I know this project, but in my experience some of their widgets don't perform very well. Personally, when it comes to data visualization I think Flash is the best technology we have at the moment, and it's quite easy to interact with it via JavaScript and GWT (that's how Google does it for most of their applications/services: Analytics, Finance, etc.).

> Solr Explorer - A generic GWT client for Solr
> ---------------------------------------------
>
>                 Key: SOLR-1163
>                 URL: https://issues.apache.org/jira/browse/SOLR-1163
>             Project: Solr
>          Issue Type: New Feature
>          Components: web gui
>    Affects Versions: 1.3
>            Reporter: Uri Boness
>         Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch
>
> The attached patch is a generic GWT client for Solr. It is currently standalone, meaning that once built, one can open the generated HTML file in a browser and communicate with any deployed Solr. It is configured with its own configuration file, where one can configure the Solr instance/core to connect to. Since it's currently standalone and completely client side based, it uses JSON with padding (cross-site scripting) to connect to remote Solr servers.
> Some of the supported features:
> - Simple query search
> - Sorting - one can dynamically define new sort criteria
> - Search results are rendered very much like Google search results. It is also possible to view all stored field values for every hit.
> - Custom hit rendering - it is possible to show thumbnails (images) per hit and also customize a view for a hit based on HTML templates
> - Faceting - one can dynamically define field and query facets via the UI. It is also possible to pre-configure these facets in the configuration file.
> - Highlighting - you can dynamically configure highlighting. It can also be pre-configured in the configuration file.
> - Spellchecking - you can dynamically configure spell checking. Can also be done in the configuration file. Supports collation. It is also possible to send "build" and "reload" commands.
> - Data import handler - if used, it is possible to send a "full-import" and "status" command ("delta-import" is not implemented yet, but it's easy to add)
> - Console - for development time, there's a small console which can help to better understand what's going on behind the scenes. One can use it to:
> ** view the client logs
> ** browse the Solr schema
> ** view a breakdown of the current search context
> ** view a breakdown of the query URL that is sent to Solr
> ** view the raw JSON response returned from Solr
> This client is actually a platform that can be greatly extended for more things. The goal is to have a client where the explorer part is just one view of it. Other future views include: Monitoring, Administration, Query Builder, DataImportHandler configuration, and more...
> To get a better view of what's currently possible, we've set up a public version of this client at: http://search.jteam.nl/explorer. This client is configured with one Solr instance where crawled YouTube movies were indexed.
> You can also check out a screencast for this deployed client: http://search.jteam.nl/help
> The patch creates a new folder in the contrib directory. Since the patch doesn't contain binaries, an additional zip file is provided that needs to be extracted to add all the required graphics. This module is maven2 based and is configured in such a way that all GWT related tools/libraries are automatically downloaded when the module is compiled. One of the artifacts of the build is a war file which can be deployed in any servlet container.
> NOTE: this client works best on WebKit based browsers (for performance reasons) but also works on Firefox and IE 7+. That said, it should be taken into account that it is still under development.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1378) Add reference to Packt's Solr book.
[ https://issues.apache.org/jira/browse/SOLR-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746392#action_12746392 ]

Yonik Seeley commented on SOLR-1378:
------------------------------------

David, I needed to put tags around the book image in the news section. "forrest run" (interactive mode) does not detect all the errors that the straight "forrest" will. That said, I can't build the current site myself... not even on a clean checkout with your patch not applied (I've only tried forrest 0.8 so far). Anyone else?

> Add reference to Packt's Solr book.
> -----------------------------------
>
>                 Key: SOLR-1378
>                 URL: https://issues.apache.org/jira/browse/SOLR-1378
>             Project: Solr
>          Issue Type: Task
>            Reporter: David Smiley
>         Attachments: solr-book-image.jpg, solr_book_packt.patch
>
> I've attached news of the Solr update. It includes an image under the left nav area, and a news item with the same image. The text is as follows:
> David Smiley and Eric Pugh are proud to introduce the first book on Solr, "Solr 1.4 Enterprise Search Server" from Packt Publishing.
> This book is a comprehensive reference guide for nearly every feature Solr has to offer. It serves the reader right from initiation to development to deployment. It also comes with complete running examples to demonstrate its use and show how to integrate it with other languages and frameworks.
> To keep this interesting and realistic, it uses a large open source set of metadata about artists, releases, and tracks courtesy of the MusicBrainz.org project. Using this data as a testing ground for Solr, you will learn how to import this data in various ways, from CSV to XML to database access. You will then learn how to search this data in a myriad of ways, including Solr's rich query syntax, "boosting" match scores based on record data and other means, searching across multiple fields with different boosts, getting facets on the results, auto-completing user queries, spell-correcting searches, highlighting queried text in search results, and so on.
> After this thorough tour, you'll see working examples of integrating a variety of technologies with Solr, such as Java, JavaScript, Drupal, Ruby, PHP, and Python.
> Finally, this book covers various deployment considerations, including indexing strategies and performance-oriented configuration, that will enable you to scale Solr to meet the needs of a high-volume site.
[jira] Updated: (SOLR-1375) BloomFilter on a field
[ https://issues.apache.org/jira/browse/SOLR-1375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Rutherglen updated SOLR-1375:
-----------------------------------

    Attachment: SOLR-1375.patch

* Bug fixes
* Core name included in response
* Wiki is located at http://wiki.apache.org/solr/BloomIndexComponent

> BloomFilter on a field
> ----------------------
>
>                 Key: SOLR-1375
>                 URL: https://issues.apache.org/jira/browse/SOLR-1375
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: 1.5
>
>         Attachments: SOLR-1375.patch, SOLR-1375.patch, SOLR-1375.patch
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> * A bloom filter is a read-only probabilistic set. It's useful for verifying that a key exists in a set, though it can return false positives. http://en.wikipedia.org/wiki/Bloom_filter
> * The use case is indexing in Hadoop and checking for duplicates against a Solr cluster, which (when using the term dictionary or a query) is too slow and exceeds the time consumed for indexing. When a match is found, the host, segment, and term are returned. If the same term is found on multiple servers, multiple results are returned by the distributed process. (We'll need to add in the core name, I just realized.)
> * When new segments are created and commit is called, a new bloom filter is generated from a given field (default: id) by iterating over the term dictionary values. There's a bloom filter file per segment, which is managed on each Solr shard. When segments are merged away, their corresponding .blm files are also removed. In a future version we'll have a central server for the bloom filters so we're not abusing the thread pool of the Solr proxy and the networking of the Solr cluster (this will be done sooner rather than later after testing this version). I held off because the central server requires syncing the Solr servers' files (which is like reverse replication).
> * The patch uses the BloomFilter from Hadoop 0.20. I want to jar up only the necessary classes so we don't have a giant Hadoop jar in lib. http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/bloom/BloomFilter.html
> * Distributed code is added and seems to work; I extended TestDistributedSearch to test over multiple HTTP servers. I chose this approach rather than the manual method used by (for example) TermVectorComponent.testDistributed because I'm new to Solr's distributed search and wanted to learn how it works (the stages are confusing). Using this method, I didn't need to set up multiple tomcat servers and manually execute tests.
> * We need more of the bloom filter options passable via solrconfig
> * I'll add more test cases
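For readers unfamiliar with the data structure, the property the duplicate-check use case relies on (false positives possible, false negatives impossible) can be sketched in a few lines of plain Java. This is a generic illustration using java.util.BitSet and simple double hashing; it is not the Hadoop BloomFilter the patch actually uses, and the class and method names here are made up for the sketch.

```java
import java.util.BitSet;

// Minimal Bloom filter sketch: k hash probes into an m-bit array.
// Membership tests may return false positives but never false negatives.
public class SimpleBloomFilter {
    private final BitSet bits;
    private final int m;  // number of bits
    private final int k;  // number of hash probes per key

    public SimpleBloomFilter(int numBits, int numHashes) {
        this.m = numBits;
        this.k = numHashes;
        this.bits = new BitSet(numBits);
    }

    // Derive the i-th probe position from two base hashes
    // (the Kirsch-Mitzenmacher double-hashing trick).
    private int probe(Object key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9e3779b9;
        return Math.floorMod(h1 + i * h2, m);
    }

    public void add(Object key) {
        for (int i = 0; i < k; i++) bits.set(probe(key, i));
    }

    // true means "possibly present"; false means "definitely absent".
    public boolean mightContain(Object key) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(probe(key, i))) return false;
        }
        return true;
    }
}
```

Per the description above, the real component builds one such filter per segment from the term dictionary of the id field, so a duplicate check is just a mightContain-style probe per shard.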
Re: distributed search components
I was working on MLT component with patch SOLR-788.

On Aug 21, 2009, at 6:49 PM, Yonik Seeley wrote:

> On Fri, Aug 21, 2009 at 6:35 PM, Mike Anderson wrote:
>> I've been trying to dissect the MLT component and understand how it works.
>> Every time I think I have the process figured out I somehow just end up
>> more confused.
>
> I don't think MLT supports distributed search.
> http://wiki.apache.org/solr/DistributedSearch
>
> -Yonik
> http://www.lucidimagination.com
[jira] Commented: (SOLR-1369) Add HSQLDB Jar to example-dih
[ https://issues.apache.org/jira/browse/SOLR-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746363#action_12746363 ]

Eric Pugh commented on SOLR-1369:
---------------------------------

I tweaked the docs to point to HSQLDB 1.8. I'll leave the "unzip hsqldb.zip" and "svn add hsqldb/" and "svn ci -m 'expanding example to make getting started easier' hsqldb/" to a committer versus attaching a large patch file!

Eric

> Add HSQLDB Jar to example-dih
> -----------------------------
>
>                 Key: SOLR-1369
>                 URL: https://issues.apache.org/jira/browse/SOLR-1369
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>            Reporter: Eric Pugh
>
> I went back to show someone the Example-DIH and followed the wiki page directions. I then ran into an error because the example uses HSQLDB 1.8, and the hsqldb.jar I downloaded from hsqldb.org was 1.9. The 1.9 RC shows up above the 1.8 version.
> I see two approaches: 1) Be clearer in the docs, maybe embed a direct link to http://sourceforge.net/projects/hsqldb/files/hsqldb/hsqldb_1_8_0/hsqldb_1_8_0_10.zip/download. 2) Include hsqldb.jar in the example. I am assuming the reason this wasn't done was because of licensing issues?
> Also, any real reason to zip the hsqldb database? It's under 20k expanded and adds another step.
> Figured I'd get the wisdom of the crowds before changing.
> Eric
Re: distributed search components
On Fri, Aug 21, 2009 at 12:52 PM, Mike Anderson wrote:
> I'm trying to make my way through learning how to modify and write
> distributed search components.

The whole ResponseBuilder stuff is really a first pass - it obviously could use refinement. As you go through, it would be great if you could keep in mind how things could be improved, in addition to how it currently works. Don't try to make sense of this as anyone's idea of "ideal code" but rather "code that currently works".

> A few questions
>
> 1. in SearchHandler, when the query is broken down and sent to each shard,
> will this request make its way to the process() method of the component
> (because it will look like a non-distributed request to the SearchHandler
> of the shard)?

Yes.

> 2. the comment above the response handling loop (in SearchHandler) says that
> if any requests are added while in the loop, the loop will break and make
> the request immediately. I see that the loop will exit if there is an
> exception or if there are no more responses, but I don't see how the new
> requests will be called unless it goes through the entire loop again.

Here's the code:

  // now wait for replies, but if anyone puts more requests on
  // the outgoing queue, send them out immediately (by exiting
  // this loop)
  while (rb.outgoing.size() == 0) {
    [ receive a response, and process the response ]
  }

If any code processing the response adds another request to the outgoing queue, then the loop will break and the new outgoing requests will be sent. So it's not *quite* immediate... it's after components have processed the response.

> 3. if one adds a request to rb in the handleResponses method, this wouldn't
> necessarily be called, namely in the event that none of the components
> override the distributedProcess method, and the loop only goes through once.
>
> 4. where can I learn more about the shard.purpose variable? Where in the
> component should this be set, if anywhere?

  public final static int PURPOSE_PRIVATE         = 0x01;
  public final static int PURPOSE_GET_TERM_DFS    = 0x02;
  public final static int PURPOSE_GET_TOP_IDS     = 0x04;
  public final static int PURPOSE_REFINE_TOP_IDS  = 0x08;
  public final static int PURPOSE_GET_FACETS      = 0x10;
  public final static int PURPOSE_REFINE_FACETS   = 0x20;
  public final static int PURPOSE_GET_FIELDS      = 0x40;
  public final static int PURPOSE_GET_HIGHLIGHTS  = 0x80;
  public final static int PURPOSE_GET_DEBUG       = 0x100;
  public final static int PURPOSE_GET_STATS       = 0x200;

  public int purpose;  // the purpose of this request

It's for declaring what a request is for, so other components can piggyback on that request if they want and avoid sending a separate request. For example, the highlighting component chooses to request highlighting only by piggybacking on requests to retrieve stored fields:

  // Turn on highlighting only when retrieving fields
  if ((sreq.purpose & ShardRequest.PURPOSE_GET_FIELDS) != 0) {
    sreq.purpose |= ShardRequest.PURPOSE_GET_HIGHLIGHTS;
    // should already be true...
    sreq.params.set(HighlightParams.HIGHLIGHT, "true");
  }

The facet component will also look for suitable other outgoing requests to piggyback on and modify, and if it can't find any, will create a new request. See FacetComponent.java:134

Some of these are currently unused - PURPOSE_GET_TERM_DFS, for example, would be for getting the doc freqs to implement a global idf.

-Yonik
http://www.lucidimagination.com

> I've taken a look at the wiki page, but if there is more documentation
> elsewhere please point me towards it.
>
> Thanks in advance,
> Mike
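The piggybacking described above is plain bitmask arithmetic on the purpose field. A minimal self-contained sketch of the pattern (the constants mirror the ones quoted, but the PurposeFlags class itself is a simplified stand-in for ShardRequest, not Solr code):

```java
// Simplified stand-in for Solr's ShardRequest purpose flags, showing how
// one component piggybacks on a request that another component created.
public class PurposeFlags {
    public static final int PURPOSE_GET_TOP_IDS    = 0x04;
    public static final int PURPOSE_GET_FIELDS     = 0x40;
    public static final int PURPOSE_GET_HIGHLIGHTS = 0x80;

    public int purpose;

    // Mirrors what the highlighting component does: only piggyback on a
    // request that is already fetching stored fields; otherwise leave it.
    public void maybeAddHighlighting() {
        if ((purpose & PURPOSE_GET_FIELDS) != 0) {
            purpose |= PURPOSE_GET_HIGHLIGHTS;
        }
    }

    // Any component can ask what an outgoing request is for.
    public boolean hasPurpose(int flag) {
        return (purpose & flag) != 0;
    }
}
```

Because each purpose is a distinct bit, one request can carry several purposes at once, which is exactly what lets components share a single round trip to the shards.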
Re: distributed search components
On Fri, Aug 21, 2009 at 6:35 PM, Mike Anderson wrote:
> I've been trying to dissect the MLT component and understand how it works.
> Every time I think I have the process figured out I somehow just end up
> more confused.

I don't think MLT supports distributed search.
http://wiki.apache.org/solr/DistributedSearch

-Yonik
http://www.lucidimagination.com
[jira] Commented: (SOLR-1377) Force TokenizerFactory to create a Tokenizer rather then TokenStream
[ https://issues.apache.org/jira/browse/SOLR-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746311#action_12746311 ]

Yonik Seeley commented on SOLR-1377:
------------------------------------

bq. For the Pattern implementation, all the tokens are created beforehand and are just passed off with iter.next(), so if the input changes, the whole thing would need to change.

And it does now... I moved the creation of the Token to init() so it's recreated with every reset.

bq. Any reason not to implement reset on: TrieTokenizerFactory?

TrieTokenizer (right below the factory) already implements reset(Reader).

> Force TokenizerFactory to create a Tokenizer rather then TokenStream
> --------------------------------------------------------------------
>
>                 Key: SOLR-1377
>                 URL: https://issues.apache.org/jira/browse/SOLR-1377
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.4
>
>         Attachments: SOLR-1377-Tokenizer.patch, SOLR-1377.patch
>
> The new token reuse classes require that they are created with a Tokenizer. The Solr TokenizerFactory interface currently makes a TokenStream.
> Although this is an API-breaking change, the alternative is to just document that it needs to be a Tokenizer instance and throw an error when it is not.
> For more discussion, see: http://www.lucidimagination.com/search/document/272b8c4e6198d887/trunk_classcastexception_with_basetokenizerfactory
Re: distributed search components
I've been trying to dissect the MLT component and understand how it works. Every time I think I have the process figured out I somehow just end up more confused. Here is my best guess so far at how the process and flow work:

1. request comes in, and is routed to the distributed section of SearchHandler
2. request is sent to each shard
3. after the shard returns a list of Doc IDs, new MLT requests are created, one for each Doc ID (this happens in handleResponses())
4. each MLT request is processed on the same shard (this happens in process())
5. shard returns MLT results, which are collated (this happens in finishStage())

although I don't think this is quite right because it doesn't match my print statements. I also noticed that the purpose isn't 400 but 401. What's up with this? Is 401 a code for something else? (As an aside, is it unsafe to assume that the logs will appear in actual chronological order?)

Any advice or pointers at this point would be greatly appreciated... I think I'm going in circles.

-mike

On Aug 21, 2009, at 12:54 PM, Jason Rutherglen wrote:

> Mike,
>
> I'm also finding the Solr distributed process to be confusing. Let's try to
> add things to the wiki as we learn them?
>
> -J
>
> On Fri, Aug 21, 2009 at 9:52 AM, Mike Anderson wrote:
>> I'm trying to make my way through learning how to modify and write
>> distributed search components.
>>
>> A few questions
>>
>> 1. in SearchHandler, when the query is broken down and sent to each shard,
>> will this request make its way to the process() method of the component
>> (because it will look like a non-distributed request to the SearchHandler
>> of the shard)?
>> 2. the comment above the response handling loop (in SearchHandler) says
>> that if any requests are added while in the loop, the loop will break and
>> make the request immediately. I see that the loop will exit if there is an
>> exception or if there are no more responses, but I don't see how the new
>> requests will be called unless it goes through the entire loop again.
>> 3. if one adds a request to rb in the handleResponses method, this
>> wouldn't necessarily be called, namely in the event that none of the
>> components override the distributedProcess method, and the loop only goes
>> through once.
>> 4. where can I learn more about the shard.purpose variable? Where in the
>> component should this be set, if anywhere?
>>
>> I've taken a look at the wiki page, but if there is more documentation
>> elsewhere please point me towards it.
>>
>> Thanks in advance,
>> Mike
[jira] Commented: (SOLR-1377) Force TokenizerFactory to create a Tokenizer rather then TokenStream
[ https://issues.apache.org/jira/browse/SOLR-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746287#action_12746287 ]

Ryan McKinley commented on SOLR-1377:
-------------------------------------

Is reset guaranteed to be called on the same Reader? For the Pattern implementation, all the tokens are created beforehand and are just passed off with iter.next(), so if the input changes, the whole thing would need to change.

+  public void reset(Reader input) throws IOException {
+    super.reset(input);
+    init();
+  }

Any reason not to implement reset on: TrieTokenizerFactory?
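The reuse pattern under discussion - reset(Reader) re-running init() so that tokens built up front track the new input - can be sketched without any Lucene dependency. The class and method names below are illustrative only, not the actual PatternTokenizer code from the patch:

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch of a tokenizer that materializes all tokens up front (as the
// pattern tokenizer does) and therefore must rebuild them on reset(Reader).
public class ReusableSplitTokenizer {
    private Reader input;
    private Iterator<String> iter;

    public ReusableSplitTokenizer(Reader input) throws IOException {
        this.input = input;
        init();
    }

    // Read the whole input and pre-split it into tokens.
    private void init() throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = input.read()) != -1) sb.append((char) c);
        List<String> tokens = new ArrayList<>();
        for (String t : sb.toString().split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        iter = tokens.iterator();
    }

    // Reuse this instance on a new Reader: swap the input and re-init,
    // so the pre-built token list reflects the new stream.
    public void reset(Reader input) throws IOException {
        this.input = input;
        init();
    }

    // Returns the next token, or null when the stream is exhausted.
    public String next() {
        return iter.hasNext() ? iter.next() : null;
    }
}
```

The point of the question above is exactly this: if reset() did not re-run init(), the iterator would keep serving tokens from the old Reader.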
[jira] Updated: (SOLR-1377) Force TokenizerFactory to create a Tokenizer rather then TokenStream
[ https://issues.apache.org/jira/browse/SOLR-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-1377:
-------------------------------

    Attachment: SOLR-1377.patch

Uploading another patch based on yours that implements reuse (reset(Reader)) for the Tokenizers. +1
[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746261#action_12746261 ]

Lance Norskog commented on SOLR-1163:
-------------------------------------

Does a GWT client application have a clean license? Are there any other GWT apps in the Apache project?

+1. This is great.

The [Simile|http://simile.mit.edu/] project has some nice data explorer UIs. The [Simile-Widget|http://www.simile-widgets.org/] gallery displays them.
[jira] Updated: (SOLR-1378) Add reference to Packt's Solr book.
[ https://issues.apache.org/jira/browse/SOLR-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley updated SOLR-1378:
-------------------------------

    Attachment: solr-book-image.jpg
                solr_book_packt.patch

The image goes here: src\site\src\documentation\content\xdocs\images\solr-book-image.jpg
[jira] Updated: (SOLR-1377) Force TokenizerFactory to create a Tokenizer rather then TokenStream
[ https://issues.apache.org/jira/browse/SOLR-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-1377:
--------------------------------

    Attachment: SOLR-1377-Tokenizer.patch

Here is a patch that:
1. Changes the TokenizerFactory to return a Tokenizer
2. Updates all TokenizerFactory classes to explicitly return a Tokenizer
3. Changes the PatternTokenizerFactory to return a Tokenizer
4. Adds a test that calls PatternTokenizer

Since this is an API breaking change, I added this to the "Upgrading from Solr 1.3" section of CHANGES.txt:

{panel}
The TokenizerFactory API has changed to explicitly return a Tokenizer rather than a TokenStream (that may or may not be a Tokenizer). This change is required to take advantage of the Token reuse improvements in Lucene 2.9. For more information, see SOLR-1377.
{panel}

I'll wait for two +1 votes on this, since it does break back compatibility.
[jira] Created: (SOLR-1378) Add reference to Packt's Solr book.
Add reference to Packt's Solr book.
-----------------------------------

                 Key: SOLR-1378
                 URL: https://issues.apache.org/jira/browse/SOLR-1378
             Project: Solr
          Issue Type: Task
            Reporter: David Smiley

I've attached news of the Solr update. It includes an image under the left nav area, and a news item with the same image. The text is as follows:

David Smiley and Eric Pugh are proud to introduce the first book on Solr, "Solr 1.4 Enterprise Search Server" from Packt Publishing.

This book is a comprehensive reference guide for nearly every feature Solr has to offer. It serves the reader right from initiation to development to deployment. It also comes with complete running examples to demonstrate its use and show how to integrate it with other languages and frameworks.

To keep this interesting and realistic, it uses a large open source set of metadata about artists, releases, and tracks courtesy of the MusicBrainz.org project. Using this data as a testing ground for Solr, you will learn how to import this data in various ways, from CSV to XML to database access. You will then learn how to search this data in a myriad of ways, including Solr's rich query syntax, "boosting" match scores based on record data and other means, searching across multiple fields with different boosts, getting facets on the results, auto-completing user queries, spell-correcting searches, highlighting queried text in search results, and so on.

After this thorough tour, you'll see working examples of integrating a variety of technologies with Solr, such as Java, JavaScript, Drupal, Ruby, PHP, and Python.

Finally, this book covers various deployment considerations, including indexing strategies and performance-oriented configuration, that will enable you to scale Solr to meet the needs of a high-volume site.
[jira] Updated: (SOLR-1375) BloomFilter on a field
[ https://issues.apache.org/jira/browse/SOLR-1375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated SOLR-1375: --- Attachment: SOLR-1375.patch * The Hadoop BloomFilter code is included in the patch > BloomFilter on a field > -- > > Key: SOLR-1375 > URL: https://issues.apache.org/jira/browse/SOLR-1375 > Project: Solr > Issue Type: New Feature > Components: update >Affects Versions: 1.4 >Reporter: Jason Rutherglen >Priority: Minor > Fix For: 1.5 > > Attachments: SOLR-1375.patch, SOLR-1375.patch > > Original Estimate: 120h > Remaining Estimate: 120h > > * A bloom filter is a read-only probabilistic set. It's useful > for verifying that a key exists in a set, though it can return false > positives. http://en.wikipedia.org/wiki/Bloom_filter > * The use case is indexing in Hadoop and checking for duplicates > against a Solr cluster, which (when using the term dictionary or a > query) is too slow and exceeds the time consumed for indexing. > When a match is found, the host, segment, and term are returned. > If the same term is found on multiple servers, multiple results > are returned by the distributed process. (We'll need to add in > the core name, I just realized.) > * When new segments are created and commit is called, a new > bloom filter is generated from a given field (default: id) by > iterating over the term dictionary values. There's a bloom > filter file per segment, which is managed on each Solr shard. > When segments are merged away, their corresponding .blm files are > also removed. In a future version we'll have a central server > for the bloom filters so we're not abusing the thread pool of > the Solr proxy and the networking of the Solr cluster (this will > be done sooner rather than later, after testing this version). I held > off because the central server requires syncing the Solr > servers' files (which is like reverse replication). > * The patch uses the BloomFilter from Hadoop 0.20. 
I want to jar > up only the necessary classes so we don't have a giant Hadoop > jar in lib. > http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/bloom/BloomFilter.html > * Distributed code is added and seems to work; I extended > TestDistributedSearch to test over multiple HTTP servers. I > chose this approach rather than the manual method used by (for > example) TermVectorComponent.testDistributed because I'm new to > Solr's distributed search and wanted to learn how it works (the > stages are confusing). Using this method, I didn't need to set up > multiple Tomcat servers and manually execute tests. > * We need to make more of the bloom filter options passable via > solrconfig > * I'll add more test cases -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
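For readers unfamiliar with the data structure, the duplicate-check behavior described in the issue (a definite "absent" vs. a probabilistic "possibly present") can be illustrated with a toy Bloom filter. This is a self-contained sketch using double hashing; the patch itself reuses org.apache.hadoop.util.bloom.BloomFilter, not anything like this code:

```java
import java.util.BitSet;

// Toy Bloom filter: k bit positions per key, derived from two base hashes.
public class ToyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    public ToyBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Double hashing: position i is (h1 + i * h2) mod size.
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | (h1 << 16); // cheap second hash via rotation
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String key) {
        for (int i = 0; i < hashes; i++) bits.set(position(key, i));
    }

    // false means "definitely absent"; true means "possibly present"
    // (false positives happen when unrelated keys set the same bits).
    public boolean mightContain(String key) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(position(key, i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        ToyBloomFilter f = new ToyBloomFilter(1 << 16, 3);
        f.add("doc-1");
        System.out.println(f.mightContain("doc-1")); // true
        System.out.println(f.mightContain("doc-2")); // false (with very high probability)
    }
}
```

This is why the issue describes the structure as "read-only" once built from the term dictionary: bits are only ever set, never cleared, so deleting a key requires rebuilding the filter, which matches the per-segment .blm lifecycle described above.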
[jira] Created: (SOLR-1377) Force TokenizerFactory to create a Tokenizer rather than TokenStream
Force TokenizerFactory to create a Tokenizer rather than TokenStream - Key: SOLR-1377 URL: https://issues.apache.org/jira/browse/SOLR-1377 Project: Solr Issue Type: New Feature Components: Analysis Reporter: Ryan McKinley Assignee: Ryan McKinley Fix For: 1.4 The new token reuse classes require that they are created with a Tokenizer. The Solr TokenizerFactory interface currently makes a TokenStream. Although this is an API-breaking change, the alternative is to just document that it needs to be a Tokenizer instance and throw an error when it is not. For more discussion, see: http://www.lucidimagination.com/search/document/272b8c4e6198d887/trunk_classcastexception_with_basetokenizerfactory -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: distributed search components
Mike, I'm also finding the Solr distributed process to be confusing. Let's try to add things to the wiki as we learn them? -J On Fri, Aug 21, 2009 at 9:52 AM, Mike Anderson wrote: > I'm trying to make my way through learning how to modify and write > distributed search components. > > A few questions > > 1. in SearchHandler, when the query is broken down and sent to each shard, > will this request make its way to the process() method of the component > (because it will look like a non-distributed request to the SearchHandler of > the shard)? > > 2. the comment above the response handling loop (in SearchHandler) says that > if any requests are added while in the loop, the loop will break and make > the request immediately. I see that the loop will exit if there is an > exception or if there are no more responses, but I don't see how the new > requests will be called unless it goes through the entire loop again. > > 3. if one adds a request to rb in the handleResponses method, this wouldn't > necessarily be called, namely in the event that none of the components > override the distributedProcess method, and the loop only goes through once. > > 4. where can I learn more about the shard.purpose variable? Where in the > component should this be set, if anywhere? > > > I've taken a look at the wiki page, but if there is more documentation > elsewhere please point me towards it. > > Thanks in advance, > Mike > >
distributed search components
I'm trying to make my way through learning how to modify and write distributed search components. A few questions: 1. in SearchHandler, when the query is broken down and sent to each shard, will this request make its way to the process() method of the component (because it will look like a non-distributed request to the SearchHandler of the shard)? 2. the comment above the response handling loop (in SearchHandler) says that if any requests are added while in the loop, the loop will break and make the request immediately. I see that the loop will exit if there is an exception or if there are no more responses, but I don't see how the new requests will be called unless it goes through the entire loop again. 3. if one adds a request to rb in the handleResponses method, this wouldn't necessarily be called, namely in the event that none of the components override the distributedProcess method, and the loop only goes through once. 4. where can I learn more about the shard.purpose variable? Where in the component should this be set, if anywhere? I've taken a look at the wiki page, but if there is more documentation elsewhere please point me towards it. Thanks in advance, Mike
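Questions 2 and 3 both come down to when follow-up shard requests actually get sent. The following is a deliberately simplified sketch of the general submit/collect loop pattern, with made-up names (Component, run, oneRefinement); it is not Solr's actual SearchHandler code, but it shows why a request queued while handling a response is only issued if control returns to the top of the loop:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class DistributedLoopSketch {

    public interface Component {
        // May enqueue follow-up shard requests while a response is handled.
        void handleResponse(String response, Queue<String> outgoing);
    }

    public static List<String> run(List<Component> components, List<String> initialRequests) {
        Queue<String> outgoing = new ArrayDeque<>(initialRequests);
        List<String> responses = new ArrayList<>();
        // The outer loop keeps running while anything is pending. A request
        // queued during response handling is only sent because control
        // returns to the top of this loop for another iteration.
        while (!outgoing.isEmpty()) {
            String request = outgoing.poll();
            String response = "response-to-" + request; // pretend shard round-trip
            for (Component c : components) {
                c.handleResponse(response, outgoing);
            }
            responses.add(response);
        }
        return responses;
    }

    // A component that asks for one refinement round after the first response.
    public static Component oneRefinement() {
        return new Component() {
            boolean asked = false;
            @Override
            public void handleResponse(String response, Queue<String> outgoing) {
                if (!asked) {
                    asked = true;
                    outgoing.add("refinement");
                }
            }
        };
    }

    public static void main(String[] args) {
        List<String> handled = run(List.of(oneRefinement()), List.of("initial"));
        System.out.println(handled); // [response-to-initial, response-to-refinement]
    }
}
```

In this simplified shape, if no component ever enqueues anything (the analogue of no component overriding distributedProcess), the loop drains the initial requests once and exits, which is the behavior question 3 describes.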
Re: /trunk ClassCastException with BaseTokenizerFactory
Ryan McKinley wrote: > Ahh, I see: > Tokenizer extends TokenStream > > So if this is going to break everything that implements TokenStream > rather than Tokenizer, it seems we should change the TokenizerFactory > API to: > public Tokenizer create( Reader input ) > rather than: > public TokenStream create( Reader input ); > > I would WAY rather have my compiler tell me something is wrong than > get an error and then find some documentation about the tokenizer. > > - - - - - > > Personally, I think lucene/solr just need to fess up and admit that > 2.9 is *not* totally back compatible. I don't think anyone contends that Lucene is totally back-compat, and insofar as that goes, there is no way Solr totally is: it exposes a lot of Lucene. We admit our breaks in this release in the back-compat breaks section. There is no way we will release claiming total back compat. Not even in the realm of possibility. > No way is the Multireader change back-compatible! Personally, pure API-wise, I think it was. It's a stickier issue on the possibly higher RAM usage, but to me, that's more of a runtime change. Certain methods have always changed over time in their resource usage, and I think that's within back compat. This was a steep one to swallow though, I'll admit. Basically we just thought it was way worth it long term. And Hoss came up with some great ideas to help ease the possible pain. > > ryan > > > On Aug 21, 2009, at 11:39 AM, Yonik Seeley wrote: > >> On Fri, Aug 21, 2009 at 10:13 AM, Ryan McKinley >> wrote: >>> I'm fine upgrading, but it seems we should make the 'back compatibility' >>> notice more explicit. >> >> Yeah... that should be fun for expert-use plugins in general. In >> Lucene-land, this is the release of the "break"... I think we've >> covered the changes reasonably well in our external APIs, but people >> can always use pretty much the full Lucene API when writing Solr >> plugins. 
>> >> I think we'll need to document that things in <tokenizer> tags need to >> inherit from Tokenizer classes. It is technically a back-compat >> break, but I assume it will affect very few users? >> >> -Yonik >> http://www.lucidimagination.com > -- - Mark http://www.lucidimagination.com
Re: /trunk ClassCastException with BaseTokenizerFactory
On Fri, Aug 21, 2009 at 12:22 PM, Ryan McKinley wrote: > Ahh, I see: > Tokenizer extends TokenStream > > So if this is going to break everything that implements TokenStream rather > than Tokenizer, it seems we should change the TokenizerFactory API to: > public Tokenizer create( Reader input ) > rather than: > public TokenStream create( Reader input ); > > I would WAY rather have my compiler tell me something is wrong than get an > error and then find some documentation about the tokenizer. +1 Absolutely. -Yonik http://www.lucidimagination.com
Re: /trunk ClassCastException with BaseTokenizerFactory
Ahh, I see: Tokenizer extends TokenStream So if this is going to break everything that implements TokenStream rather than Tokenizer, it seems we should change the TokenizerFactory API to: public Tokenizer create( Reader input ) rather than: public TokenStream create( Reader input ); I would WAY rather have my compiler tell me something is wrong than get an error and then find some documentation about the tokenizer. - - - - - Personally, I think lucene/solr just need to fess up and admit that 2.9 is *not* totally back compatible. No way is the Multireader change back-compatible! ryan On Aug 21, 2009, at 11:39 AM, Yonik Seeley wrote: On Fri, Aug 21, 2009 at 10:13 AM, Ryan McKinley wrote: I'm fine upgrading, but it seems we should make the 'back compatibility' notice more explicit. Yeah... that should be fun for expert-use plugins in general. In Lucene-land, this is the release of the "break"... I think we've covered the changes reasonably well in our external APIs, but people can always use pretty much the full Lucene API when writing Solr plugins. I think we'll need to document that things in <tokenizer> tags need to inherit from Tokenizer classes. It is technically a back-compat break, but I assume it will affect very few users? -Yonik http://www.lucidimagination.com
[jira] Commented: (SOLR-1376) invalid links to solr indexes after a new index is created
[ https://issues.apache.org/jira/browse/SOLR-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746035#action_12746035 ] kiran sugana commented on SOLR-1376: Hi Hoss, By incremental indexing I meant commits. I do not know for sure whether the list of deleted files grows over time. We noticed the issue when Solr became slow or unresponsive on a machine on which Solr had not been restarted for a while. I will investigate whether the deleted files list is growing. Kiran > invalid links to solr indexes after a new index is created > -- > > Key: SOLR-1376 > URL: https://issues.apache.org/jira/browse/SOLR-1376 > Project: Solr > Issue Type: Bug > Components: clients - java >Affects Versions: 1.3 >Reporter: kiran sugana > Fix For: 1.4 > > > After a new index is created, it does not delete the links to the old indexes. > To recreate the issue: > 1) do an incremental indexing > 2) cd /proc/[JAVA_PID]/fd > 3) ls -la > {code} > lr-x-- 1 solr roleusers 64 Jul 23 17:31 75 -> > /home//solrhome/data/index/_kja.fdx (deleted) > lr-x-- 1 solr roleusers 64 Jul 23 17:31 76 -> > /home/./solrhome/data/index/_kk4.tis (deleted) > lr-x-- 1 solr roleusers 64 Jul 23 17:31 78 -> > /home//solrhome/data/index/_kk4.frq (deleted) > lr-x-- 1 solr roleusers 64 Jul 23 17:31 79 -> > /home//solrhome/data/index/_kk4.prx (deleted) > {code} > This is creating performance issues (search slows down significantly). > Temp Resolution: > Restart solr -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: /trunk ClassCastException with BaseTokenizerFactory
On Fri, Aug 21, 2009 at 10:13 AM, Ryan McKinley wrote: > I'm fine upgrading, but it seems we should make the 'back compatibility' > notice more explicit. Yeah... that should be fun for expert-use plugins in general. In Lucene-land, this is the release of the "break"... I think we've covered the changes reasonably well in our external APIs, but people can always use pretty much the full Lucene API when writing Solr plugins. I think we'll need to document that things in <tokenizer> tags need to inherit from Tokenizer classes. It is technically a back-compat break, but I assume it will affect very few users? -Yonik http://www.lucidimagination.com
Re: /trunk ClassCastException with BaseTokenizerFactory
On Aug 21, 2009, at 10:49 AM, Yonik Seeley wrote: On Fri, Aug 21, 2009 at 10:33 AM, Ryan McKinley wrote: Actually I think there may be something wrong here. BaseTokenizerFactory does not make a Tokenizer, it creates a TokenStream, so it should never be cast to Tokenizer. My custom TokenizerFactory now looks the same as: o.a.s.analysis.PatternTokenizerFactory Urg... looks like there's no end-to-end (index then search) test for PatternTokenizerFactory, so we never caught this. I guess we need to add one :) It seems like when something is specified as a <tokenizer> in schema.xml it should in fact be a tokenizer - it's the only way tokenstream reuse works. I don't see anything in Solr that creates a Tokenizer. The TokenizerFactory just creates a TokenStream. It seems that TokenizerFactory really needs to be: public Tokenizer create( Reader input ) rather than: public TokenStream create( Reader input ); I don't see any backwards-compatible way to make this change! ideas? ryan
Re: /trunk ClassCastException with BaseTokenizerFactory
On Fri, Aug 21, 2009 at 10:33 AM, Ryan McKinley wrote: > Actually I think there may be something wrong here. > > BaseTokenizerFactory does not make a Tokenizer, it creates a > TokenStream, so it should never be cast to Tokenizer > > My custom TokenizerFactory now looks the same as: > o.a.s.analysis.PatternTokenizerFactory Urg... looks like there's no end-to-end (index then search) test for PatternTokenizerFactory, so we never caught this. It seems like when something is specified as a <tokenizer> in schema.xml it should in fact be a tokenizer - it's the only way tokenstream reuse works. -Yonik
Re: /trunk ClassCastException with BaseTokenizerFactory
Actually I think there may be something wrong here. BaseTokenizerFactory does not make a Tokenizer, it creates a TokenStream, so it should never be cast to Tokenizer. My custom TokenizerFactory now looks the same as: o.a.s.analysis.PatternTokenizerFactory Not sure what to look at next... ideas? thanks ryan On Fri, Aug 21, 2009 at 10:13 AM, Ryan McKinley wrote: > Just updated to /trunk and am now seeing this exception: > > Caused by: org.apache.solr.client.solrj.SolrServerException: > java.lang.ClassCastException: > xxx.solr.analysis.JSONKeyValueTokenizerFactory$1 cannot be cast to > org.apache.lucene.analysis.Tokenizer > at > org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:141) > ... 15 more > Caused by: java.lang.ClassCastException: > xxx.solr.analysis.JSONKeyValueTokenizerFactory$1 cannot be cast to > org.apache.lucene.analysis.Tokenizer > at > org.apache.solr.analysis.TokenizerChain.getStream(TokenizerChain.java:69) > at > org.apache.solr.analysis.SolrAnalyzer.reusableTokenStream(SolrAnalyzer.java:74) > at > org.apache.solr.schema.IndexSchema$SolrIndexAnalyzer.reusableTokenStream(IndexSchema.java:364) > at > org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:124) > at > org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:244) > at > org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:772) > > > Looks like SolrIndexAnalyzer now assumes everything uses the new > TokenStream API... > > I'm fine upgrading, but it seems we should make the 'back compatibility' > notice more explicit. > > > FYI, this is what the TokenizerFactory looks like: > > public class JSONKeyValueTokenizerFactory extends BaseTokenizerFactory > { > ... > > public TokenStream create(Reader input) { > final JSONParser js = new JSONParser( input ); > final Stack keystack = new Stack(); > > return new TokenStream() > { > ... >
/trunk ClassCastException with BaseTokenizerFactory
Just updated to /trunk and am now seeing this exception: Caused by: org.apache.solr.client.solrj.SolrServerException: java.lang.ClassCastException: xxx.solr.analysis.JSONKeyValueTokenizerFactory$1 cannot be cast to org.apache.lucene.analysis.Tokenizer at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:141) ... 15 more Caused by: java.lang.ClassCastException: xxx.solr.analysis.JSONKeyValueTokenizerFactory$1 cannot be cast to org.apache.lucene.analysis.Tokenizer at org.apache.solr.analysis.TokenizerChain.getStream(TokenizerChain.java:69) at org.apache.solr.analysis.SolrAnalyzer.reusableTokenStream(SolrAnalyzer.java:74) at org.apache.solr.schema.IndexSchema$SolrIndexAnalyzer.reusableTokenStream(IndexSchema.java:364) at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:124) at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:244) at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:772) Looks like SolrIndexAnalyzer now assumes everything uses the new TokenStream API... I'm fine upgrading, but it seems we should make the 'back compatibility' notice more explicit. FYI, this is what the TokenizerFactory looks like: public class JSONKeyValueTokenizerFactory extends BaseTokenizerFactory { ... public TokenStream create(Reader input) { final JSONParser js = new JSONParser( input ); final Stack keystack = new Stack(); return new TokenStream() { ...
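The root cause of this ClassCastException can be reproduced with stand-in classes (the TokenStream/Tokenizer types below are minimal mock-ups, not the real Lucene ones): the factory's `new TokenStream() { ... }` produces an anonymous class whose superclass is TokenStream, not Tokenizer, so the downstream cast fails at runtime; returning an anonymous subclass of Tokenizer instead satisfies the cast:

```java
import java.io.Reader;
import java.io.StringReader;

public class TokenizerCastSketch {
    // Minimal stand-ins for the Lucene hierarchy.
    public static class TokenStream {}
    public static class Tokenizer extends TokenStream {
        protected final Reader input;
        public Tokenizer(Reader input) { this.input = input; }
    }

    // Broken shape: the anonymous class extends TokenStream, so a cast
    // to Tokenizer (as done inside the analyzer chain) throws.
    public static TokenStream createBroken(Reader input) {
        return new TokenStream() { /* tokenizing logic would go here */ };
    }

    // Fixed shape: the anonymous class extends Tokenizer, so the cast succeeds.
    public static TokenStream createFixed(Reader input) {
        return new Tokenizer(input) { /* tokenizing logic would go here */ };
    }

    public static void main(String[] args) {
        System.out.println(createBroken(new StringReader("{}")) instanceof Tokenizer); // false
        System.out.println(createFixed(new StringReader("{}")) instanceof Tokenizer);  // true
    }
}
```

This is why a compile-time fix (narrowing the factory's declared return type to Tokenizer, as later done in SOLR-1377) is preferable to documenting the requirement: the broken shape above compiles cleanly and only fails once the reuse path runs.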
[jira] Updated: (SOLR-1275) Add expungeDeletes to DirectUpdateHandler2
[ https://issues.apache.org/jira/browse/SOLR-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-1275: --- Attachment: SOLR-1275.patch > Add expungeDeletes to DirectUpdateHandler2 > -- > > Key: SOLR-1275 > URL: https://issues.apache.org/jira/browse/SOLR-1275 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 1.3 >Reporter: Jason Rutherglen >Assignee: Noble Paul >Priority: Trivial > Fix For: 1.4 > > Attachments: SOLR-1275.patch, SOLR-1275.patch, SOLR-1275.patch, > SOLR-1275.patch, SOLR-1275.patch > > Original Estimate: 48h > Remaining Estimate: 48h > > expungeDeletes is a useful method, somewhat like the optimize offered by > IndexWriter, that can be implemented in DirectUpdateHandler2. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1275) Add expungeDeletes to DirectUpdateHandler2
[ https://issues.apache.org/jira/browse/SOLR-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745952#action_12745952 ] Yonik Seeley commented on SOLR-1275: bq. Calling SR.undelete would remove the deletes and the test would pass? Simple to fix... check against the exact number of documents instead of checking that there are no deletes. > Add expungeDeletes to DirectUpdateHandler2 > -- > > Key: SOLR-1275 > URL: https://issues.apache.org/jira/browse/SOLR-1275 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 1.3 >Reporter: Jason Rutherglen >Assignee: Noble Paul >Priority: Trivial > Fix For: 1.4 > > Attachments: SOLR-1275.patch, SOLR-1275.patch, SOLR-1275.patch, > SOLR-1275.patch > > Original Estimate: 48h > Remaining Estimate: 48h > > expungeDeletes is a useful method, somewhat like the optimize offered by > IndexWriter, that can be implemented in DirectUpdateHandler2. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1335) load core properties from a properties file
[ https://issues.apache.org/jira/browse/SOLR-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1335: - Attachment: SOLR-1335.patch * The properties filename is configurable from solr.xml on a per-core basis * The testcase is cleaned up > load core properties from a properties file > --- > > Key: SOLR-1335 > URL: https://issues.apache.org/jira/browse/SOLR-1335 > Project: Solr > Issue Type: New Feature >Reporter: Noble Paul >Assignee: Noble Paul > Fix For: 1.4 > > Attachments: SOLR-1335.patch, SOLR-1335.patch, SOLR-1335.patch, > SOLR-1335.patch > > > There are a few ways of loading properties at runtime: > # using an env property on the command line > # if you use multicore, drop it in solr.xml > If not, the only way is to keep a separate solrconfig.xml for each instance. > #1 is error prone if the user fails to start with the correct system > property. > In our case we have four different configurations for the same deployment. > And we have to disable replication of solrconfig.xml. > It would be nice if I could distribute four properties files so that our ops can > drop the right one in and start Solr. It is also possible for operations to > edit a properties file, but it is risky to edit solrconfig.xml if they do not > understand Solr. > I propose a properties file in the instancedir named solrcore.properties. If > present, it would be loaded and its entries added as core-specific properties. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
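As a rough illustration of what a per-core properties file buys you, the sketch below loads plain java.util.Properties and resolves a single ${...} placeholder, which is conceptually how property substitution in solrconfig.xml works; the property names and the resolve helper here are made up for this example, not Solr code:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class CorePropertiesSketch {

    // Load properties from a string (in Solr's case this would be a
    // solrcore.properties file in the core's instance dir).
    public static Properties load(String contents) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(contents));
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen for a StringReader
        }
        return p;
    }

    // Resolve the first ${name} placeholder in a value against the properties.
    public static String resolve(String value, Properties props) {
        int start = value.indexOf("${");
        if (start < 0) return value;
        int end = value.indexOf('}', start);
        String name = value.substring(start + 2, end);
        return value.substring(0, start) + props.getProperty(name, "") + value.substring(end + 1);
    }

    public static void main(String[] args) {
        // Ops drop in a deployment-specific file; the shared config is untouched.
        Properties core = load("data.dir=/var/solr/core0/data\nenable.master=true\n");
        System.out.println(resolve("<dataDir>${data.dir}</dataDir>", core));
    }
}
```

The point of the proposal is exactly this separation: solrconfig.xml references ${data.dir}-style names and can be replicated unchanged, while each deployment's small, easy-to-edit properties file supplies the values.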
Re: [jira] Updated: (SOLR-1366) UnsupportedOperationException may be thrown when using custom IndexReader
On Fri, Aug 21, 2009 at 3:20 AM, Chris Hostetter wrote: > > : Shalin Shekhar Mangar updated SOLR-1366: > : > : > : Component/s: replication (java) > > the issue seems broader than just replication ... I would change this back > to a generic "search" component, and open new related issue(s) for > replication (documentation vs custom reader support) ... some pieces of > this may make it into 1.4 and some may not, so we'll want to track > separately. > > I just added "replication" in addition to "search" to the components field so that this issue shows up against both. I'll open a new issue for the documentation updates. -- Regards, Shalin Shekhar Mangar.
Re: Deleting a field at runtime
It depends on which level you want to delete it from. If you just want Solr to know nothing about the field, then you can remove it from the schema and reload the core (or restart Solr). Technically the field will still exist in the Lucene index, but if you're only accessing that index through Solr, it will effectively not exist any more. (I think.) T On 20 Aug 2009, at 18:20, KishoreVeleti CoreObjects wrote: Hi All, Just completed an interview on SOLR - one of the questions was "is it possible to remove a field from an existing index". I am not sure what the business use case is here. My understanding is that it is not possible. Still, I wanted to know from SOLR experts: is it possible to remove a field from an existing index? Thanks in Advance, Kishore Veleti A.V.K. -- View this message in context: http://www.nabble.com/Deleting-a-field-at-runtime-tp25066329p25066329.html Sent from the Solr - Dev mailing list archive at Nabble.com. -- Toby Cole Software Engineer, Semantico Limited Registered in England and Wales no. 03841410, VAT no. GB-744614334. Registered office Lees House, 21-23 Dyke Road, Brighton BN1 3FE, UK. Check out all our latest news and thinking on the Discovery blog http://blogs.semantico.com/discovery-blog/
Build failed in Hudson: Solr-trunk #901
See http://hudson.zones.apache.org/hudson/job/Solr-trunk/901/changes Changes: [yonik] SOLR-1368: add ms() and sub() functions [hossman] cleanup of comments relating to 'default' field values; cleanup of 'timestamp' usage examples -- switched to using 'manufacturedate_dt' as a generic date field example since yonik doesn't want schema to have fields with default values uncommented [hossman] remove executable bit from csv file [hossman] SOLR-1373: Add Filter query to admin/form.jsp [hossman] SOLR-1371: LukeRequestHandler/schema.jsp errored if schema had no uniqueKey field. The new test for this also (hopefully) adds some future proofing against similar bugs in the future. As a side effect QueryElevationComponentTest was refactored, and a bug in that test was found. -- [...truncated 2204 lines...] [junit] Running org.apache.solr.analysis.TestStopFilterFactory [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 6.057 sec [junit] Running org.apache.solr.analysis.TestSynonymFilter [junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 6.419 sec [junit] Running org.apache.solr.analysis.TestSynonymMap [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 5.7 sec [junit] Running org.apache.solr.analysis.TestTrimFilter [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 2.013 sec [junit] Running org.apache.solr.analysis.TestWordDelimiterFilter [junit] Tests run: 13, Failures: 0, Errors: 0, Time elapsed: 36.385 sec [junit] Running org.apache.solr.client.solrj.SolrExceptionTest [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 2.11 sec [junit] Running org.apache.solr.client.solrj.SolrQueryTest [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.664 sec [junit] Running org.apache.solr.client.solrj.TestBatchUpdate [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 23.244 sec [junit] Running org.apache.solr.client.solrj.TestLBHttpSolrServer [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 17.019 sec [junit] 
Running org.apache.solr.client.solrj.beans.TestDocumentObjectBinder [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 1.513 sec [junit] Running org.apache.solr.client.solrj.embedded.JettyWebappTest [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 18.176 sec [junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeBinaryJettyTest [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 11.644 sec [junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeEmbeddedTest [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 5.605 sec [junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeJettyTest [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 11.11 sec [junit] Running org.apache.solr.client.solrj.embedded.MergeIndexesEmbeddedTest [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 5.239 sec [junit] Running org.apache.solr.client.solrj.embedded.MultiCoreEmbeddedTest [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 5.156 sec [junit] Running org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 5.541 sec [junit] Running org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest [junit] Tests run: 8, Failures: 0, Errors: 1, Time elapsed: 19.198 sec [junit] Test org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest FAILED [junit] Running org.apache.solr.client.solrj.embedded.SolrExampleJettyTest [junit] [junit] ERRORunknown_field_timestamp [junit] [junit] request: http://localhost:60348/example/update?wt=javabin&version=1) [junit] Tests run: 9, Failures: 0, Errors: 1, Time elapsed: 26.718 sec [junit] Test org.apache.solr.client.solrj.embedded.SolrExampleJettyTest FAILED [junit] Running org.apache.solr.client.solrj.embedded.SolrExampleStreamingTest [junit] Tests run: 8, Failures: 1, Errors: 0, Time elapsed: 38.089 sec [junit] Test org.apache.solr.client.solrj.embedded.SolrExampleStreamingTest FAILED [junit] 
Running org.apache.solr.client.solrj.embedded.TestSolrProperties [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 7.263 sec [junit] Running org.apache.solr.client.solrj.request.TestUpdateRequestCodec [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.469 sec [junit] Running org.apache.solr.client.solrj.response.AnlysisResponseBaseTest [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.963 sec [junit] Running org.apache.solr.client.solrj.response.DocumentAnalysisResponseTest [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.623 sec [junit] Running org.apache.solr.client.solrj.response.FieldAnalysisResponseTest [junit] Tests run: 1, Failures: 0, Errors: 0, Time e
Solr nightly build failure
init-forrest-entities: [mkdir] Created dir: /tmp/apache-solr-nightly/build [mkdir] Created dir: /tmp/apache-solr-nightly/build/web compile-solrj: [mkdir] Created dir: /tmp/apache-solr-nightly/build/solrj [javac] Compiling 84 source files to /tmp/apache-solr-nightly/build/solrj [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. compile: [mkdir] Created dir: /tmp/apache-solr-nightly/build/solr [javac] Compiling 372 source files to /tmp/apache-solr-nightly/build/solr [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. compileTests: [mkdir] Created dir: /tmp/apache-solr-nightly/build/tests [javac] Compiling 166 source files to /tmp/apache-solr-nightly/build/tests [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. 
junit: [mkdir] Created dir: /tmp/apache-solr-nightly/build/test-results [junit] Running org.apache.solr.BasicFunctionalityTest [junit] Tests run: 19, Failures: 0, Errors: 0, Time elapsed: 43.438 sec [junit] Running org.apache.solr.ConvertedLegacyTest [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 22.641 sec [junit] Running org.apache.solr.DisMaxRequestHandlerTest [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 21.906 sec [junit] Running org.apache.solr.EchoParamsTest [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 19.662 sec [junit] Running org.apache.solr.MinimalSchemaTest [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 12.82 sec [junit] Running org.apache.solr.OutputWriterTest [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 6.85 sec [junit] Running org.apache.solr.SampleTest [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 6.033 sec [junit] Running org.apache.solr.SolrInfoMBeanTest [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 3.165 sec [junit] Running org.apache.solr.TestDistributedSearch [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 107.777 sec [junit] Running org.apache.solr.TestTrie [junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 17.618 sec [junit] Running org.apache.solr.analysis.DoubleMetaphoneFilterFactoryTest [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.711 sec [junit] Running org.apache.solr.analysis.DoubleMetaphoneFilterTest [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.734 sec [junit] Running org.apache.solr.analysis.EnglishPorterFilterFactoryTest [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.808 sec [junit] Running org.apache.solr.analysis.HTMLStripCharFilterTest [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 1.389 sec [junit] Running org.apache.solr.analysis.LengthFilterTest [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.57 sec [junit] Running 
org.apache.solr.analysis.SnowballPorterFilterFactoryTest [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.925 sec [junit] Running org.apache.solr.analysis.TestBufferedTokenStream [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.503 sec [junit] Running org.apache.solr.analysis.TestCapitalizationFilter [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.647 sec [junit] Running org.apache.solr.analysis.TestDelimitedPayloadTokenFilterFactory [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 8.684 sec [junit] Running org.apache.solr.analysis.TestHyphenatedWordsFilter [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 2.581 sec [junit] Running org.apache.solr.analysis.TestKeepFilterFactory [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 7.251 sec [junit] Running org.apache.solr.analysis.TestKeepWordFilter [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 2.714 sec [junit] Running org.apache.solr.analysis.TestMappingCharFilterFactory [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.786 sec [junit] Running org.apache.solr.analysis.TestPatternReplaceFilter [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 5.998 sec [junit] Running org.apache.solr.analysis.TestPatternTokenizerFactory [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 4.164 sec [junit] Running org.apache.solr.analysis.Te