[jira] Commented: (SOLR-534) Return all query results with parameter rows=-1
[ https://issues.apache.org/jira/browse/SOLR-534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832351#action_12832351 ] Walter Underwood commented on SOLR-534: --- -1 This adds a denial of service vulnerability to Solr. One query can use lots of CPU or memory, or even crash the server. This could also take out an entire distributed system. If this is added, we MUST add a config option to disable it. Let's take this back to the mailing list and find out why they believe all results are needed. There must be a better way to solve this. > Return all query results with parameter rows=-1 > --- > > Key: SOLR-534 > URL: https://issues.apache.org/jira/browse/SOLR-534 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.3 > Environment: Tomcat 5.5 >Reporter: Lars Kotthoff >Priority: Minor > Attachments: solr-all-results.patch > > > The searcher should return all results matching a query when the parameter > rows=-1 is given. > I know that it is a bad idea to do this in general, but as it explicitly > requires a special parameter, people using this feature will be aware of what > they are doing. The main use case for this feature is probably debugging, but > in some cases one might actually need to retrieve all results because they > e.g. are to be merged with results from different sources. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
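If the feature does go in, the guard Walter asks for could be little more than a configurable cap checked before the search runs. A hypothetical sketch (the class and parameter names are invented for illustration, not Solr's actual config handling):

```java
public final class RowsGuard {
    // Hypothetical cap; a real deployment would read this from solrconfig.xml.
    // A value <= 0 means "no limit configured".
    private final int maxRows;

    public RowsGuard(int maxRows) {
        this.maxRows = maxRows;
    }

    /** Resolve the user-supplied rows parameter: -1 means "all documents",
     *  but never more than the configured maximum. */
    public int resolveRows(int requestedRows, int numDocsInIndex) {
        int wanted = (requestedRows == -1) ? numDocsInIndex : requestedRows;
        if (maxRows > 0 && wanted > maxRows) {
            throw new IllegalArgumentException(
                "rows=" + requestedRows + " exceeds configured maximum " + maxRows);
        }
        return wanted;
    }
}
```

With a cap configured, rows=-1 is only honored up to the limit, which addresses the denial-of-service concern while keeping the debugging use case.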
[jira] Commented: (SOLR-1216) disambiguate the replication command names
[ https://issues.apache.org/jira/browse/SOLR-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719625#action_12719625 ] Walter Underwood commented on SOLR-1216: If we choose a name for the thing we are pulling, like "image", then we can use "makeimage", "pullimage", etc. > disambiguate the replication command names > -- > > Key: SOLR-1216 > URL: https://issues.apache.org/jira/browse/SOLR-1216 > Project: Solr > Issue Type: Improvement > Components: replication (java) >Reporter: Noble Paul >Assignee: Noble Paul > Fix For: 1.4 > > Attachments: SOLR-1216.patch > > > There is a lot of confusion in the naming of various commands such as > snappull, snapshot etc. This is a vestige of the script based replication we > currently have. The commands can be renamed to make more sense > * 'snappull' to be renamed to 'sync' > * 'snapshot' to be renamed to 'backup' > thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1216) disambiguate the replication command names
[ https://issues.apache.org/jira/browse/SOLR-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719609#action_12719609 ] Walter Underwood commented on SOLR-1216: "sync" is a weak name, because it doesn't say whether it is a push or pull synchronization. > disambiguate the replication command names > -- > > Key: SOLR-1216 > URL: https://issues.apache.org/jira/browse/SOLR-1216 > Project: Solr > Issue Type: Improvement > Components: replication (java) >Reporter: Noble Paul >Assignee: Noble Paul > Fix For: 1.4 > > Attachments: SOLR-1216.patch > > > There is a lot of confusion in the naming of various commands such as > snappull, snapshot etc. This is a vestige of the script based replication we > currently have. The commands can be renamed to make more sense > * 'snappull' to be renamed to 'sync' > * 'snapshot' to be renamed to 'backup' > thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1073) StrField should allow locale sensitive sorting
[ https://issues.apache.org/jira/browse/SOLR-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703893#action_12703893 ] Walter Underwood commented on SOLR-1073: Using the locale of the JVM is very, very bad for a multilingual server. Solr should always use the same, simple locale. It is OK to set a Locale in configuration for single-language installations, but using the JVM locale is a recipe for disaster. You move Solr to a different server and everything breaks. Very, very bad. In a multi-lingual config, locales must be set per-request. Ideally, requests should send an ISO language code as context for the query. > StrField should allow locale sensitive sorting > -- > > Key: SOLR-1073 > URL: https://issues.apache.org/jira/browse/SOLR-1073 > Project: Solr > Issue Type: Improvement > Environment: All >Reporter: Sachin > Attachments: LocaleStrField.java > > > Currently, StrField does not take a parameter which it can pass to ctor of > SortField making the StrField's sorting rely on the locale of the JVM. > Ideally, StrField should allow setting the locale in the schema.xml and use > it to create a new instance of the SortField in getSortField() method, > something like: > snip: > public SortField getSortField(SchemaField field,boolean reverse) > { > ... > Locale locale = new Locale(lang,country); > return new SortField(field.getName(), locale, reverse); > } > More details about this issue here: > http://www.nabble.com/CJKAnalyzer-and-Chinese-Text-sort-td22374195.html -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
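For the multilingual case Walter describes, a per-request locale boils down to picking a collator per request rather than trusting the JVM default. A minimal illustration using plain JDK collation (java.text.Collator, not Solr's SortField wiring):

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public final class LocaleSort {
    /** Sort strings with an explicitly chosen locale, as the comment
     *  recommends, instead of relying on the JVM's default locale. */
    public static List<String> sorted(List<String> terms, Locale locale) {
        List<String> copy = new ArrayList<>(terms);
        // Collator implements Comparator<Object>, so it can drive the sort directly.
        Collator collator = Collator.getInstance(locale);
        copy.sort(collator);
        return copy;
    }
}
```

With a German collator, "äpfel" sorts with the a's instead of after "z", which a binary code-point sort would get wrong.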
[jira] Commented: (SOLR-1044) Use Hadoop RPC for inter Solr communication
[ https://issues.apache.org/jira/browse/SOLR-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678601#action_12678601 ] Walter Underwood commented on SOLR-1044: During the Oscars, the HTTP cache in front of our Solr farm had a 90% hit rate. I think a 10X reduction in server load is a testimony to the superiority of the HTTP approach. > Use Hadoop RPC for inter Solr communication > --- > > Key: SOLR-1044 > URL: https://issues.apache.org/jira/browse/SOLR-1044 > Project: Solr > Issue Type: New Feature > Components: search >Reporter: Noble Paul > > Solr uses http for distributed search . We can make it a whole lot faster if > we use an RPC mechanism which is more lightweight/efficient. > Hadoop RPC looks like a good candidate for this. > The implementation should just have one protocol. It should follow the Solr's > idiom of making remote calls . A uri + params +[optional stream(s)] . The > response can be a stream of bytes. > To make this work we must make the SolrServer implementation pluggable in > distributed search. Users should be able to choose between the current > CommonshttpSolrServer, or a HadoopRpcSolrServer . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-822) CharFilter - normalize characters before tokenizer
[ https://issues.apache.org/jira/browse/SOLR-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642188#action_12642188 ] Walter Underwood commented on SOLR-822: --- Yes, it should be in Lucene. Like this: http://webui.sourcelabs.com/lucene/issues/1343 There are (at least) four kinds of character mapping: Unicode normalization from decomposed to composed forms (always safe). Unicode normalization from compatibility forms to standard forms (may change the look, like fullwidth to halfwidth Latin). Language-specific normalization, like "oe" to "ö" (German-only). Mappings that improve search but are linguistically dodgy, like stripping accents and mapping katakana to hiragana. wunder > CharFilter - normalize characters before tokenizer > -- > > Key: SOLR-822 > URL: https://issues.apache.org/jira/browse/SOLR-822 > Project: Solr > Issue Type: New Feature > Components: Analysis >Reporter: Koji Sekiguchi >Priority: Minor > Attachments: character-normalization.JPG, sample_mapping_ja.txt, > SOLR-822.patch, SOLR-822.patch > > > A new plugin which can be placed in front of . > {code:xml} > positionIncrementGap="100" > > > mapping="mapping_ja.txt" /> > > words="stopwords.txt"/> > > > > {code} > can be multiple (chained). I'll post a JPEG file to show > character normalization sample soon. > MOTIVATION: > In Japan, there are two types of tokenizers -- N-gram (CJKTokenizer) and > Morphological Analyzer. > When we use morphological analyzer, because the analyzer uses Japanese > dictionary to detect terms, > we need to normalize characters before tokenization. > I'll post a patch soon, too. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
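The first two mappings on that list are exactly what the JDK already provides through java.text.Normalizer: NFC for decomposed-to-composed, NFKC for compatibility-to-standard forms. A small sketch:

```java
import java.text.Normalizer;

public final class NormalizeDemo {
    /** Composed forms: "e" + combining acute (U+0301) becomes the single
     *  code point U+00E9 under NFC. Always safe, per the comment above. */
    public static String toComposed(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    /** Compatibility forms: fullwidth "Ａ" (U+FF21) becomes plain "A" under
     *  NFKC. May change the rendered look of the text. */
    public static String toCompat(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }
}
```

The last two kinds (language-specific rules and lossy folding like accent stripping) are not covered by Unicode normalization and need explicit mapping tables, which is what the proposed CharFilter supplies.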
[jira] Commented: (SOLR-815) Add new Japanese half-width/full-width normalization Filter and Factory
[ https://issues.apache.org/jira/browse/SOLR-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641071#action_12641071 ] Walter Underwood commented on SOLR-815: --- I looked it up, and even found a reason to do it the right way. Latin should be normalized to halfwidth (in the Latin-1 character space). Kana should be normalized to fullwidth. Normalizing Latin characters to fullwidth would mean you could not use the existing accent-stripping filters or probably any other filter that expected Latin-1, like synonyms. Normalizing to halfwidth makes the rest of Solr and Lucene work as expected. See section 12.5: http://www.unicode.org/versions/Unicode5.0.0/ch12.pdf The compatibility forms (the ones we normalize away from) are in the Unicode range U+FF00 to U+FFEF. The correct mappings from those forms are in this doc: http://www.unicode.org/charts/PDF/UFF00.pdf Other charts are here: http://www.unicode.org/charts/ > Add new Japanese half-width/full-width normalization Filter and Factory > -- > > Key: SOLR-815 > URL: https://issues.apache.org/jira/browse/SOLR-815 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.3 >Reporter: Todd Feak >Assignee: Koji Sekiguchi >Priority: Minor > Attachments: SOLR-815.patch > > > Japanese Katakana and Latin alphabet characters exist as both a "half-width" > and "full-width" version. This new Filter normalizes to the full-width > version to allow searching and indexing using both. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
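The Latin half of the mapping is mechanical, because the fullwidth ASCII variants U+FF01 through U+FF5E sit at a fixed offset from U+0021 through U+007E in the chart cited above. A sketch of that mapping (an illustrative helper, not the patch under discussion):

```java
public final class LatinWidth {
    /** Map fullwidth ASCII variants (U+FF01..U+FF5E) to their halfwidth
     *  forms by the fixed offset between the two Unicode blocks, so the
     *  downstream Latin-1 filters keep working as the comment recommends. */
    public static String toHalfwidthLatin(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= '\uFF01' && c <= '\uFF5E') {
                out.append((char) (c - 0xFF01 + 0x21));
            } else if (c == '\u3000') {
                out.append(' ');          // ideographic space -> ASCII space
            } else {
                out.append(c);            // kana and everything else untouched here
            }
        }
        return out.toString();
    }
}
```

Kana would go the other direction (halfwidth U+FF61..U+FF9F to fullwidth), which is table-driven rather than a simple offset.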
[jira] Commented: (SOLR-815) Add new Japanese half-width/full-width normalization Filter and Factory
[ https://issues.apache.org/jira/browse/SOLR-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640609#action_12640609 ] Walter Underwood commented on SOLR-815: --- If I remember correctly, Latin characters should normalize to half-width, not full-width. > Add new Japanese half-width/full-width normalization Filter and Factory > -- > > Key: SOLR-815 > URL: https://issues.apache.org/jira/browse/SOLR-815 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.3 >Reporter: Todd Feak >Priority: Minor > Attachments: SOLR-815.patch > > > Japanese Katakana and Latin alphabet characters exist as both a "half-width" > and "full-width" version. This new Filter normalizes to the full-width > version to allow searching and indexing using both. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-814) Add new Japanese Hiragana Filter and Factory
[ https://issues.apache.org/jira/browse/SOLR-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640605#action_12640605 ] Walter Underwood commented on SOLR-814: --- This seems like a bad idea. Hiragana and katakana are used quite differently in Japanese. They are not interchangeable. I was the engineer for Japanese support in Ultraseek for years and even visited our distributor there, but no one ever asked for this feature. They asked for a lot of things, but never this. It is very useful, maybe essential, to normalize full-width and half-width versions of hiragana, katakana, and ASCII. > Add new Japanese Hiragana Filter and Factory > > > Key: SOLR-814 > URL: https://issues.apache.org/jira/browse/SOLR-814 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.3 >Reporter: Todd Feak >Priority: Minor > Attachments: SOLR-814.patch > > > Japanese Hiragana and Katakana character sets can be easily translated > between. This filter normalizes all Hiragana characters to their Katakana > counterpart, allowing for indexing and searching using either. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-777) backward match search, for domain search etc.
[ https://issues.apache.org/jira/browse/SOLR-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632489#action_12632489 ] Walter Underwood commented on SOLR-777: --- You don't need backwards matching for this, and it doesn't really do the right thing. Split the string on ".", reverse the list, and join successive sublists with ".". Don't index the length-one list, since that is ".com", ".net", etc. Do the same processing at query time. This is a special analyzer. > backward match search, for domain search etc. > - > > Key: SOLR-777 > URL: https://issues.apache.org/jira/browse/SOLR-777 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.3 >Reporter: Koji Sekiguchi >Priority: Minor > > There is a requirement for searching domains with backward match. For > example, using "apache.org" for a query string, www.apache.org, > lucene.apache.org could be returned. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
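The analyzer Walter outlines fits in a few lines: emit each domain suffix of two or more labels as a token, at both index and query time. A sketch (the helper name is illustrative, not an actual Solr analyzer):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public final class DomainTokens {
    /** Split on ".", then emit every suffix of length >= 2 as a token:
     *  "lucene.apache.org" -> ["apache.org", "lucene.apache.org"].
     *  Suffixes of length one (".org", ".com") are deliberately skipped. */
    public static List<String> tokens(String host) {
        String[] parts = host.split("\\.");
        List<String> out = new ArrayList<>();
        for (int start = parts.length - 2; start >= 0; start--) {
            out.add(String.join(".", Arrays.copyOfRange(parts, start, parts.length)));
        }
        return out;
    }
}
```

At query time, "apache.org" produces the single token "apache.org", which matches the indexed tokens of both www.apache.org and lucene.apache.org, exactly the behavior the issue asks for.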
[jira] Commented: (SOLR-600) XML parser stops working under heavy load
[ https://issues.apache.org/jira/browse/SOLR-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12605751#action_12605751 ] Walter Underwood commented on SOLR-600: --- It could also be a concurrency bug in Solr that shows up on the IBM JVM because the thread scheduler makes different decisions. > XML parser stops working under heavy load > - > > Key: SOLR-600 > URL: https://issues.apache.org/jira/browse/SOLR-600 > Project: Solr > Issue Type: Bug > Components: update >Affects Versions: 1.3 > Environment: Linux 2.6.19.7-ss0 #4 SMP Wed Mar 12 02:56:42 GMT 2008 > x86_64 Intel(R) Xeon(R) CPU X5450 @ 3.00GHz GenuineIntel GNU/Linux > Tomcat 6.0.16 > SOLR nightly 16 Jun 2008, and versions prior > JRE 1.6.0 >Reporter: John Smith > > Under heavy load, the following is spat out for every update: > org.apache.solr.common.SolrException log > SEVERE: java.lang.NullPointerException > at java.util.AbstractList$SimpleListIterator.hasNext(Unknown Source) > at > org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:225) > at > org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:66) > at > org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:196) > at > org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:965) > at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:272) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) > at > 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) > at > org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) > at > org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) > at > org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) > at > org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) > at > org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286) > at > org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844) > at > org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) > at > org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447) > at java.lang.Thread.run(Thread.java:735) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches
[ https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567068#action_12567068 ] Walter Underwood commented on SOLR-127: --- Two reasons to do HTTP caching for Solr: First, Solr is HTTP and needs to implement that correctly. Second, caches are much harder to implement and test than the cache information in HTTP. HTTP caches already exist and are well tested, so the implementation cost is zero and deployment is very easy. The HTTP spec already covers which responses should be cached. A 400 response may only be cached if it includes explicit cache control headers which allow that. See RFC 2616. We are using a caching load balancer and caching in Apache front ends to Tomcat. We see an increase of more than 2X in the capacity of our search farm. I would recommend against Solr-specific cache information in the XML part of the responses. Distributed caching is extremely difficult to get right. Around 25% of the HTTP 1.1 spec is devoted to caching and there are still grey areas. 
> Make Solr more friendly to external HTTP caches > --- > > Key: SOLR-127 > URL: https://issues.apache.org/jira/browse/SOLR-127 > Project: Solr > Issue Type: Wish >Reporter: Hoss Man >Assignee: Hoss Man > Fix For: 1.3 > > Attachments: CacheUnitTest.patch, CacheUnitTest.patch, > HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, > HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, > HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, > HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, > HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, > HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch > > > An offhand comment I saw recently reminded me of something that really bugged > me about the search solution I used *before* Solr -- it didn't play nicely > with HTTP caches that might be sitting in front of it. > At the moment, Solr doesn't put in particularly useful info in the HTTP > Response headers to aid in caching (ie: Last-Modified), responds to all HEAD > requests with a 400, and doesn't do anything special with If-Modified-Since. > At the very least, we can set a Last-Modified based on when the current > IndexReader was open (if not the Date on the IndexReader) and use the same > info to determine how to respond to If-Modified-Since requests. > (for the record, I think the reason this hasn't occurred to me in the 2+ years > I've been using Solr, is because with the internal caching, I've yet to need > to put a proxy cache in front of Solr) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches
[ https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12527694 ] Walter Underwood commented on SOLR-127: --- Last-modified does require monotonic time, but ETags are version stamps without any ordering. The indexVersion should be fine for an ETag. > Make Solr more friendly to external HTTP caches > --- > > Key: SOLR-127 > URL: https://issues.apache.org/jira/browse/SOLR-127 > Project: Solr > Issue Type: Wish >Reporter: Hoss Man > Attachments: HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, > HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch > > > An offhand comment I saw recently reminded me of something that really bugged > me about the search solution I used *before* Solr -- it didn't play nicely > with HTTP caches that might be sitting in front of it. > At the moment, Solr doesn't put in particularly useful info in the HTTP > Response headers to aid in caching (ie: Last-Modified), responds to all HEAD > requests with a 400, and doesn't do anything special with If-Modified-Since. > At the very least, we can set a Last-Modified based on when the current > IndexReader was open (if not the Date on the IndexReader) and use the same > info to determine how to respond to If-Modified-Since requests. > (for the record, I think the reason this hasn't occurred to me in the 2+ years > I've been using Solr, is because with the internal caching, I've yet to need > to put a proxy cache in front of Solr) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
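Using indexVersion as an ETag might look roughly like this servlet-style sketch; the method names are hypothetical, not Solr's actual handler code:

```java
public final class EtagCheck {
    /** Derive an opaque, quoted entity tag from the index version.
     *  ETags need no ordering, so any version stamp works. */
    public static String etagFor(long indexVersion) {
        return "\"" + Long.toHexString(indexVersion) + "\"";
    }

    /** True when the client's If-None-Match header matches the current
     *  index version, meaning a 304 Not Modified may be returned. */
    public static boolean notModified(String ifNoneMatch, long indexVersion) {
        return ifNoneMatch != null
            && (ifNoneMatch.equals("*") || ifNoneMatch.equals(etagFor(indexVersion)));
    }
}
```

A handler would send the ETag on every response and short-circuit with 304 when notModified() holds, letting any standard HTTP cache revalidate cheaply after a commit changes the index version.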
[jira] Commented: (SOLR-277) Character Entity of XHTML is not supported with XmlUpdateRequestHandler .
[ https://issues.apache.org/jira/browse/SOLR-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508408 ] Walter Underwood commented on SOLR-277: --- This is not a bug. Solr accepts XML, not XHTML. It does not accept XHTML-only entities. The Solr update XML format is a specific Solr XML format, not XML, not DocBook, not anything else. To index XHTML, parse it and convert it to Solr XML update format. > Character Entity of XHTML is not supported with XmlUpdateRequestHandler . > - > > Key: SOLR-277 > URL: https://issues.apache.org/jira/browse/SOLR-277 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 1.3 >Reporter: Toru Matsuzawa > Attachments: XmlUpdateRequestHandler.patch > > > Character Entity of XHTML is not supported with XmlUpdateRequestHandler . > http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent > http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent > http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent > It is necessary to correspond with XmlUpdateRequestHandler because xpp3 > cannot use . > I think it is necessary until StaxUpdateRequestHandler becomes "/update". -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-216) Improvements to solr.py
[ https://issues.apache.org/jira/browse/SOLR-216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499923 ] Walter Underwood commented on SOLR-216: --- GET is the right semantic for a query, since it doesn't change the resource. It also allows HTTP caching. If Solr has URL length limits, that's a bug. > Improvements to solr.py > --- > > Key: SOLR-216 > URL: https://issues.apache.org/jira/browse/SOLR-216 > Project: Solr > Issue Type: Improvement > Components: clients - python >Affects Versions: 1.2 >Reporter: Jason Cater >Assignee: Mike Klaas >Priority: Trivial > Attachments: solr.py > > > I've taken the original solr.py code and extended it to include higher-level > functions. > * Requires python 2.3+ > * Supports SSL (https://) schema > * Conforms (mostly) to PEP 8 -- the Python Style Guide > * Provides a high-level results object with implicit data type conversion > * Supports batching of update commands -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-208) RSS feed XSL example
[ https://issues.apache.org/jira/browse/SOLR-208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496624 ] Walter Underwood commented on SOLR-208: --- I wasn't in the RSS wars, either, but I was on the Atom working group. That was a bunch of volunteers making a clean, testable spec for RSS functionality (http://www.ietf.org/rfc/rfc4287). RSS 2.0 has some bad ambiguities, especially around ampersand and entities in titles. The default has changed over the years and clients do different, incompatible things. GData is just a way to do search result stuff that we would need anyway. It is a standard set of URL parameters for query, start-index, and categories, and a few Atom extensions for total results, items per page, and next/previous. http://code.google.com/apis/gdata/reference.html > RSS feed XSL example > > > Key: SOLR-208 > URL: https://issues.apache.org/jira/browse/SOLR-208 > Project: Solr > Issue Type: New Feature > Components: clients - java >Affects Versions: 1.2 >Reporter: Brian Whitman > Assigned To: Hoss Man >Priority: Trivial > Attachments: rss.xsl > > > A quick .xsl file for transforming solr queries into RSS feeds. To get the > date and time in properly you'll need an XSL 2.0 processor, as in > http://wiki.apache.org/solr/XsltResponseWriter . Tested to work with the > example solr distribution in the nightly. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-208) RSS feed XSL example
[ https://issues.apache.org/jira/browse/SOLR-208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496608 ] Walter Underwood commented on SOLR-208: --- What kind of RSS? -1 unless it is Atom. The nine variants of RSS have some nasty interop problems, even between those that are supposed to implement the same spec. Even better, a GData interface returning Atom. > RSS feed XSL example > > > Key: SOLR-208 > URL: https://issues.apache.org/jira/browse/SOLR-208 > Project: Solr > Issue Type: New Feature > Components: clients - java >Affects Versions: 1.2 >Reporter: Brian Whitman > Assigned To: Hoss Man >Priority: Trivial > Attachments: rss.xsl > > > A quick .xsl file for transforming solr queries into RSS feeds. To get the > date and time in properly you'll need an XSL 2.0 processor, as in > http://wiki.apache.org/solr/XsltResponseWriter . Tested to work with the > example solr distribution in the nightly. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-161) Dangling dash causes stack trace
[ https://issues.apache.org/jira/browse/SOLR-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473628 ] Walter Underwood commented on SOLR-161: --- It is really a Lucene query parser bug, but it wouldn't hurt to do s/(.*)-/&/ as a workaround. Assuming my ed(1) syntax is still fresh. Regardless, no query string should ever give a stack trace. --wunder > Dangling dash causes stack trace > > > Key: SOLR-161 > URL: https://issues.apache.org/jira/browse/SOLR-161 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 1.1.0 > Environment: Java 1.5, Tomcat 5.5.17, Fedora Core 4, Intel >Reporter: Walter Underwood > > I'm running tests from our search logs, and we have a query that ends in a > dash. That caused a stack trace. > org.apache.lucene.queryParser.ParseException: Cannot parse 'digging for the > truth -': Encountered "" at line 1, column 23. > Was expecting one of: > "(" ... > ... > ... > ... > ... > "[" ... > "{" ... > ... > > at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:127) > at > org.apache.solr.request.DisMaxRequestHandler.handleRequest(DisMaxRequestHandler.java:272) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:595) > at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:92) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
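The workaround amounts to deleting the dangling trailing dash before the string reaches the query parser. (As written, the ed substitution appears to be a no-op, since `&` in the replacement puts the whole match back; the intent is closer to `s/ *-$//`.) A minimal sketch in Java:

```java
public final class QueryCleanup {
    /** Strip a dangling trailing dash (and any whitespace before it) so the
     *  query parser never sees it -- the workaround suggested above. */
    public static String stripTrailingDash(String q) {
        return q.replaceAll("\\s*-$", "");
    }
}
```

Interior dashes are untouched, so hyphenated terms and legitimate NOT clauses still reach the parser unchanged.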
[jira] Commented: (SOLR-161) Dangling dash causes stack trace
[ https://issues.apache.org/jira/browse/SOLR-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473625 ] Walter Underwood commented on SOLR-161: --- The parser can have a rule for this rather than exploding. A trailing dash is never meaningful and can be omitted, whether we're allowing +/- or not. Seems like a grammar bug to me. --wunder > Dangling dash causes stack trace > > > Key: SOLR-161 > URL: https://issues.apache.org/jira/browse/SOLR-161 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 1.1.0 > Environment: Java 1.5, Tomcat 5.5.17, Fedora Core 4, Intel >Reporter: Walter Underwood > > I'm running tests from our search logs, and we have a query that ends in a > dash. That caused a stack trace. > org.apache.lucene.queryParser.ParseException: Cannot parse 'digging for the > truth -': Encountered "" at line 1, column 23. > Was expecting one of: > "(" ... > ... > ... > ... > ... > "[" ... > "{" ... > ... > > at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:127) > at > org.apache.solr.request.DisMaxRequestHandler.handleRequest(DisMaxRequestHandler.java:272) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:595) > at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:92) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-161) Dangling dash causes stack trace
Dangling dash causes stack trace Key: SOLR-161 URL: https://issues.apache.org/jira/browse/SOLR-161 Project: Solr Issue Type: Bug Components: search Affects Versions: 1.1.0 Environment: Java 1.5, Tomcat 5.5.17, Fedora Core 4, Intel Reporter: Walter Underwood I'm running tests from our search logs, and we have a query that ends in a dash. That caused a stack trace. org.apache.lucene.queryParser.ParseException: Cannot parse 'digging for the truth -': Encountered "" at line 1, column 23. Was expecting one of: "(" ... ... ... ... ... "[" ... "{" ... ... at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:127) at org.apache.solr.request.DisMaxRequestHandler.handleRequest(DisMaxRequestHandler.java:272) at org.apache.solr.core.SolrCore.execute(SolrCore.java:595) at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:92) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-129) Solrb - UTF 8 Support for add/delete
[ https://issues.apache.org/jira/browse/SOLR-129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469072 ] Walter Underwood commented on SOLR-129: --- This is not a bug, unless a bad error message is a bug. It looks like the XML uses the HTML entity "å", which is not defined in XML. It has nothing to do with UTF-8. It really should generate an error message with line number instead of a stack trace. wunder > Solrb - UTF 8 Support for add/delete > > > Key: SOLR-129 > URL: https://issues.apache.org/jira/browse/SOLR-129 > Project: Solr > Issue Type: Bug > Components: clients - ruby - flare > Environment: OSX >Reporter: Antonio Eggberg > > Hi: > This could be a ruby utf-8 bug. Anyway when I try to do a UTF-8 document add > via post.sh and then do query via Solr Admin everything works as it should. > However using the solrb ruby lib or flare UTF-8 doc add doesn't work as it > should. I am not sure what I am doing wrong and I don't think it's Solr cos it > works as it should. > Could this be a famous utf-8 ruby bug? I am using ruby 1.8.5 with rails 1.2.1 > Cheers -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
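XML predefines only five named entities (amp, lt, gt, quot, apos); every other character has to be sent literally in the document encoding or as a numeric character reference. A tiny illustrative helper showing the numeric form a client should emit instead of an HTML name like "aring":

```java
public final class XmlEntities {
    /** Return the numeric character reference for a character, the
     *  XML-safe way to send what HTML spells with a named entity. */
    public static String numericRef(char c) {
        return "&#" + (int) c + ";";
    }
}
```

So U+00E5 ("å") goes on the wire as a decimal numeric reference, which any conforming XML parser accepts without a DTD.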
[jira] Commented: (SOLR-73) schema.xml and solrconfig.xml use CNET-internal class names
[ http://issues.apache.org/jira/browse/SOLR-73?page=comments#action_12455684 ] Walter Underwood commented on SOLR-73: -- Remember, this bug is only about removing aliased names from the sample files. Note that the users in favor of alias-free sample files are all new to Solr. The people in favor of keeping them are generally long-time Solr users or developers. From a new user's point of view, they are confusing. Adding explicit alias definitions is a separate issue. > schema.xml and solrconfig.xml use CNET-internal class names > --- > > Key: SOLR-73 > URL: http://issues.apache.org/jira/browse/SOLR-73 > Project: Solr > Issue Type: Bug > Components: search >Reporter: Walter Underwood > > The configuration files in the example directory still use the old > CNET-internal class names, like solr.LRUCache instead of > org.apache.solr.search.LRUCache. This is confusing to new users and should > be fixed before the first release. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (SOLR-73) schema.xml and solrconfig.xml use CNET-internal class names
[ http://issues.apache.org/jira/browse/SOLR-73?page=comments#action_12454190 ] Walter Underwood commented on SOLR-73: -- The context required to resolve the ambiguity is a wiki page that I didn't know existed. Since I didn't know about it, I tried to figure it out by reading the code, and then by sending e-mail to the list. In my case, I was writing two tiny classes, but the issue would be the same if I were a non-programmer adding some simple plug-ins. With a full class name, there is no ambiguity. Again, this saves typing at the cost of requiring an indirection through some unspecified documentation. I saw every customer support e-mail for eight years with Ultraseek, so I'm pretty familiar with the problems that search engine admins run into. One of the things we learned was that documentation doesn't fix an unclear product. You fix the product instead of documenting how to understand it. Requiring users to edit an XML file is a separate issue, but I think it is a serious problem, especially because any error messages show up in the server logs.
[jira] Commented: (SOLR-73) schema.xml and solrconfig.xml use CNET-internal class names
[ http://issues.apache.org/jira/browse/SOLR-73?page=comments#action_12454159 ] Walter Underwood commented on SOLR-73: -- I think the aliases are harder to read. You need to go elsewhere to figure them out. I read documentation, but I didn't find the part of the wiki that explained them and I had to ask the mailing list. The javadoc uses the full class name. Google and Yahoo searches should work better with the full class name (Yahoo is working much better than Google for that right now). The aliases save typing, but I don't think they improve usability. Full class names are simple and unambiguous. If we want usability for non-programmers, we can't have them editing an XML file.
[jira] Commented: (SOLR-73) schema.xml and solrconfig.xml use CNET-internal class names
[ http://issues.apache.org/jira/browse/SOLR-73?page=comments#action_12454066 ] Walter Underwood commented on SOLR-73: -- The aliasing requires documentation and using the full class names doesn't. It seems much simpler to me to use the real class names. Less to maintain, less to test, less to explain.
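For context on the trade-off debated in these comments: the aliasing maps a short name like solr.LRUCache onto a full class name such as org.apache.solr.search.LRUCache. A hypothetical sketch of such prefix resolution follows; the package list and function are illustrative, not Solr's actual lookup logic:

```python
# Hypothetical sketch of short-name aliasing, as debated in SOLR-73.
# SEARCH_PACKAGES is illustrative; Solr's real resolution order may differ.
SEARCH_PACKAGES = ["org.apache.solr.search", "org.apache.solr.analysis"]

def expand_alias(name, known_classes):
    """Expand a "solr."-prefixed alias to a full class name.

    Returns the first matching full name found in known_classes,
    or the input unchanged if it is not an alias (or no match exists).
    """
    if name.startswith("solr."):
        short = name[len("solr."):]
        for pkg in SEARCH_PACKAGES:
            candidate = pkg + "." + short
            if candidate in known_classes:
                return candidate
    return name

known = {"org.apache.solr.search.LRUCache"}
print(expand_alias("solr.LRUCache", known))
# -> org.apache.solr.search.LRUCache
```

The sketch makes the objection concrete: resolving an alias requires an extra lookup table (here, SEARCH_PACKAGES and known_classes) that the user cannot see from the config file alone, whereas a full class name resolves with no indirection.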
[jira] Created: (SOLR-73) schema.xml and solrconfig.xml use CNET-internal class names
schema.xml and solrconfig.xml use CNET-internal class names --- Key: SOLR-73 URL: http://issues.apache.org/jira/browse/SOLR-73 Project: Solr Issue Type: Bug Components: search Reporter: Walter Underwood The configuration files in the example directory still use the old CNET-internal class names, like solr.LRUCache instead of org.apache.solr.search.LRUCache. This is confusing to new users and should be fixed before the first release.