[jira] Resolved: (SOLR-726) driver and datasources are not loaded using the multicore lib aware SolrResourceLoader

2008-08-29 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-726.


Resolution: Fixed

Committed revision 690131.

Thanks Walter and Noble!

 driver and datasources are not loaded using the multicore lib aware 
 SolrResourceLoader
 --

 Key: SOLR-726
 URL: https://issues.apache.org/jira/browse/SOLR-726
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 1.3
Reporter: Walter Ferrara
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.3

 Attachments: SOLR-726.patch, SOLR-726.patch


 see 
 http://www.nabble.com/dataimporthandler-and-mysql-connector-jar-td19146229.html
 The jar containing the (jdbc) driver has to be present in the java 
 classpath. Putting it in coreX/lib or in the shared lib dir of a multicore 
 solr doesn't work.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-728) Add synchronization to avoid race condition of multiple imports working concurrently

2008-08-29 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-728:
---

Fix Version/s: (was: 1.3)
   1.4

Unmarking this for 1.3 -- the changes may be too invasive. We shall fix this in 
the next release.

 Add synchronization to avoid race condition of multiple imports working 
 concurrently
 

 Key: SOLR-728
 URL: https://issues.apache.org/jira/browse/SOLR-728
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
Affects Versions: 1.3
Reporter: Walter Ferrara
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.4


 see 
 http://www.nabble.com/dataimporthandler-and-multiple-delta-import-td19160129.html
 The DataImportHandler import command should check whether the status is idle 
 before starting, to avoid race conditions.
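
 A minimal sketch of the kind of check the issue asks for. This is not the
 DataImportHandler code: the class ImportGuard, its Status values and the
 tryStartImport method are hypothetical names chosen for illustration. The
 point is that testing for "idle" and switching to "running" has to be one
 atomic step; otherwise two concurrent requests can both observe "idle".

```java
import java.util.concurrent.atomic.AtomicReference;

public class ImportGuard {
    // Hypothetical status values; the real handler's states may differ.
    public enum Status { IDLE, RUNNING }

    private final AtomicReference<Status> status =
        new AtomicReference<>(Status.IDLE);

    /**
     * Runs the import only if no other import is in progress.
     * Returns false (without running anything) when one is already running.
     */
    public boolean tryStartImport(Runnable importJob) {
        // Atomically flip IDLE -> RUNNING; only one caller can win this race.
        if (!status.compareAndSet(Status.IDLE, Status.RUNNING)) {
            return false;
        }
        try {
            importJob.run();
            return true;
        } finally {
            // Always return to IDLE, even if the import throws.
            status.set(Status.IDLE);
        }
    }

    public Status getStatus() {
        return status.get();
    }
}
```

 A second import request that arrives while the first is running gets false
 back and can report "busy" instead of starting a competing import.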




Re: [jira] Updated: (SOLR-726) driver and datasources are not loaded using the multicore lib aware SolrResourceLoader

2008-08-29 Thread Grant Ingersoll

I solved this using:

http://www.kfu.com/~nsayer/Java/dyn-jdbc.html

-Grant
On Aug 29, 2008, at 1:04 AM, Shalin Shekhar Mangar (JIRA) wrote:



[ https://issues.apache.org/jira/browse/SOLR-726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel 
 ]


Shalin Shekhar Mangar updated SOLR-726:
---

   Attachment: SOLR-726.patch

A hackish work around to the class loader issue.

This patch tries to use SolrResourceLoader#findClass to load the  
Driver class. It tries to use DriverManager#getConnection. If that  
fails, we try to instantiate the Driver class and use the  
Driver#connect method directly bypassing the DriverManager. This  
workaround is documented in the JdbcDataSource class.
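
The fallback described above can be sketched roughly as follows. This is an
illustration under assumptions, not the committed patch: a plain ClassLoader
stands in for SolrResourceLoader#findClass, and DriverFallback and StubDriver
are hypothetical names. StubDriver exists only so the fallback path can be
exercised without a real JDBC driver on hand.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

public class DriverFallback {

    /** Loads the driver through the given class loader, then connects. */
    public static Connection connect(ClassLoader loader, String driverClass,
                                     String url, Properties props) throws Exception {
        // Load (and initialize) the driver class with the lib-aware loader.
        Class<?> clazz = Class.forName(driverClass, true, loader);
        try {
            // DriverManager only hands out drivers visible to the caller's
            // class loader, so this can fail for a driver in a core's lib dir.
            return DriverManager.getConnection(url, props);
        } catch (SQLException e) {
            // Fallback: instantiate the driver and bypass DriverManager.
            Driver driver = (Driver) clazz.getDeclaredConstructor().newInstance();
            Connection conn = driver.connect(url, props);
            if (conn == null) { // Driver#connect returns null for foreign URLs
                throw new SQLException(
                    "Driver " + driverClass + " does not accept " + url);
            }
            return conn;
        }
    }

    /** Stand-in driver, never registered with DriverManager. */
    public static class StubDriver implements Driver {
        public Connection connect(String url, Properties info) {
            if (!acceptsURL(url)) {
                return null;
            }
            // A do-nothing Connection proxy; enough to show the code path.
            return (Connection) Proxy.newProxyInstance(
                StubDriver.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, args) -> null);
        }
        public boolean acceptsURL(String url) {
            return url != null && url.startsWith("jdbc:stub:");
        }
        public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
            return new DriverPropertyInfo[0];
        }
        public int getMajorVersion() { return 1; }
        public int getMinorVersion() { return 0; }
        public boolean jdbcCompliant() { return false; }
        public Logger getParentLogger() throws SQLFeatureNotSupportedException {
            throw new SQLFeatureNotSupportedException();
        }
    }
}
```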


I will commit this shortly.

driver and datasources are not loaded using the multicore lib aware  
SolrResourceLoader

--

   Key: SOLR-726
   URL: https://issues.apache.org/jira/browse/SOLR-726
   Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler
  Affects Versions: 1.3
  Reporter: Walter Ferrara
  Assignee: Shalin Shekhar Mangar
  Priority: Minor
   Fix For: 1.3

   Attachments: SOLR-726.patch, SOLR-726.patch


see 
http://www.nabble.com/dataimporthandler-and-mysql-connector-jar-td19146229.html
The jar containing the (jdbc) driver has to be present in the java  
classpath. Putting it in coreX/lib or in the shared lib dir of a  
multicore solr doesn't work.







[jira] Created: (SOLR-739) Add support for OmitTf

2008-08-29 Thread Mark Miller (JIRA)
Add support for OmitTf
--

 Key: SOLR-739
 URL: https://issues.apache.org/jira/browse/SOLR-739
 Project: Solr
  Issue Type: New Feature
Reporter: Mark Miller
Priority: Minor
 Fix For: 1.4


Allow setting omitTf in the field schema. Default to true for all but text 
fields.




[jira] Updated: (SOLR-739) Add support for OmitTf

2008-08-29 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-739:
-

Attachment: SOLR-739.patch

Simple patch, but my first look at Schema stuff so merits a bit of scrutiny.

 Add support for OmitTf
 --

 Key: SOLR-739
 URL: https://issues.apache.org/jira/browse/SOLR-739
 Project: Solr
  Issue Type: New Feature
Reporter: Mark Miller
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-739.patch


 Allow setting omitTf in the field schema. Default to true for all but text 
 fields.




Re: solr-specific-forrest-variables.ent

2008-08-29 Thread Grant Ingersoll

Clearing it out did the trick.  Thanks!

On Aug 28, 2008, at 5:54 PM, Chris Hostetter wrote:



: How does Forrest pick them up from siteconf.xml?  The reference in there seems
: to be to a local file and not the one in build/.

catalog.xcat refers to the file using a relative path that goes all the
way up and into the build dir...

http://svn.apache.org/viewvc/lucene/solr/trunk/src/site/src/documentation/resources/schema/catalog.xcat

:  less build/solr-specific-forrest-variables.ent
:
:   <!ENTITY solr.specversion "1.3.0.2008.08.28.17.11.58">
:
:
: But after running Forrest in the site dir, tutorial.html looks like:
...
: +  This document is for Apache Solr version 1.2.2008.08.28.14.55.52.  If
: you are using a different version of Solr, please consult the documentation
: that was distributed with the version you are using.
:
: So, it's getting today's date, but not the right version.

Even the date isn't right actually (note that the hours, minutes, and
seconds are wrong)

: It also isn't generating tutorial.pdf which seems weird.

i think something may be wonky about the timestamps on your files and what
forrest thinks is up to date ... try blowing away the forrest build dir
in src/site/src and then following the steps and see what happens (on the
trunk i'm seeing it pull in solr-specific-forrest-variables.ent correctly)


NOTE: when doing releases, you'll want to use -Dversion=X.Y.M
-Dspecversion=X.Y.M on all the ant command line calls so that the spec
version gets filled in with an official version number and not the long
ugly dev version that includes the date like you're seeing here. (it's
spelled out on the HowToRelease page)


-Hoss





[jira] Closed: (SOLR-738) Facet counts over non-linear date intervals

2008-08-29 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher closed SOLR-738.
-

Resolution: Fixed

Please use the solr-user list for usage questions.

Give facet.query a try using range queries, though :)
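
A rough sketch of what that suggestion could look like: one facet.query per
interval, each a range query ending at NOW. The field name "timestamp" and the
class DateFacetQueries are assumptions made for this example, and it presumes
the Solr version in use accepts date math such as NOW-7DAYS in range queries.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class DateFacetQueries {

    // Start points for "last 7 days", "last 30 days", "last 90 days",
    // "last 6 months", written in Solr date math.
    private static final String[] STARTS = {
        "NOW-7DAYS", "NOW-30DAYS", "NOW-90DAYS", "NOW-6MONTHS"
    };

    /** Builds the facet parameters for the given date field (hypothetical name). */
    public static String buildParams(String dateField) {
        StringBuilder sb = new StringBuilder("facet=true");
        for (String start : STARTS) {
            // e.g. timestamp:[NOW-7DAYS TO NOW]
            String range = dateField + ":[" + start + " TO NOW]";
            sb.append("&facet.query=").append(encode(range));
        }
        return sb.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

The returned string is meant to be appended to an ordinary select request
alongside the main q parameter; each facet.query then comes back with its own
count, so the intervals do not have to be the same width.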

 Facet counts over non-linear date intervals
 ---

 Key: SOLR-738
 URL: https://issues.apache.org/jira/browse/SOLR-738
 Project: Solr
  Issue Type: New Feature
  Components: clients - php
Affects Versions: 1.2
Reporter: screen
 Fix For: 1.2


 Hi All,
 How can I generate  facet counts over non-linear date intervals, for instance 
 obtaining results from: Last 7 days, Last 30 days, Last 90 days, Last 
 6 months ... ?
 I have tried everything mentioned in the docs but could not accomplish the 
 above. Using Solr 1.2 with PhpSolrClient on PHP 5.25.
 Please help...
 TIA




[jira] Commented: (SOLR-725) CoreContainer/CoreDescriptor/SolrCore cleansing

2008-08-29 Thread Henri Biestro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626993#action_12626993
 ] 

Henri Biestro commented on SOLR-725:


Paul
Would it be fair to say that you fear the alias/hardlink feature would allow 
users to make configuration/manipulation mistakes more easily wrt replication?

As is, it neither removes any feature nor forces anyone into using them; thus, 
it's not breaking anything nor does it make your use-cases more difficult. It 
might be used in the wrong way, and I don't dispute that, since it creates 
more possibilities and choices, it can lead to more mistakes. And in that sense, 
some users could end up not being able to use the feature you contribute. I do 
believe though that it's better to describe & educate on best practices than 
constrain usage.

I also understand that for solr-727/solr-561, you need some URLs to be stable 
(which is what the "cool URIs don't change" motto advocates, and this is a good 
rule). Allowing more ways to alias a core is an easier path (no pun intended) 
to this than constraining users into having just one. I can even dedicate a URL 
to replication that is not something my end-users would ever need to know 
(since I don't think my deployment constraints or choices should reflect into 
what they use).

Aliasing (the hardlink model) is not adverse to replication usage conventions & 
needs; instead, it allows respecting them more easily and with more flexibility.
Just a different Solr user & contributor opinion.


 CoreContainer/CoreDescriptor/SolrCore cleansing
 ---

 Key: SOLR-725
 URL: https://issues.apache.org/jira/browse/SOLR-725
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.3
Reporter: Henri Biestro
 Attachments: solr-725.patch, solr-725.patch, solr-725.patch


 These 3 classes and the name vs alias handling are somewhat confusing.
 The recent SOLR-647 & SOLR-716 have created a bit of a flux.
 This issue attempts to clarify the model and the list of operations. 
 h3. CoreDescriptor: describes the parameters of a SolrCore
 h4. Definitions
 * has one name
   ** The CoreDescriptor name may represent multiple aliases; in that 
 case, first alias is the SolrCore name
 * has one instance directory location
 * has one config & schema name
 h4. Operations
 The class is only a parameter passing facility
 h3. SolrCore: manages a Lucene index
 h4. Definitions
 * has one unique *name* (in the CoreContainer)
 **the *name* is used in JMX to identify the core
 * has one current set of *aliases*
 **the name is the first alias
 h4. Name & alias operations
 * *get name/aliases*: obvious
 * *alias*: adds an alias to this SolrCore
 * *unalias*: removes an alias from this SolrCore
 * *name*: sets the SolrCore name
 **potentially impacts JMX registration
 * *rename*: picks a new name from the SolrCore aliases
 **triggered when alias name is already in use
 h3. CoreContainer: manages all relations between cores & descriptors
 h4. Definitions
 * has a set of aliases (each of them pointing to one core)
 **ensure alias uniqueness.
 h4. SolrCore instance operations
 * *load*: makes a SolrCore available for requests
 **creates a SolrCore
 **registers all SolrCore aliases in the aliases set
 **(load = create + register)
 * *unload*: removes a core identified by one of its aliases
 **stops handling the Lucene index
 **all SolrCore aliases are removed
 * *reload*: recreate the core identified by one of its aliases
 * *create*: create a core from a CoreDescriptor
 **readies up the Lucene index
 * *register*: registers all aliases of a SolrCore
   
 h4. SolrCore & alias operations
 * *swap*: swaps 2 aliases
 **method: swap
 * *alias*: creates 1 alias for a core, potentially unaliasing a 
 previously used alias
 **The SolrCore name being an alias, this operation might trigger 
 a SolrCore rename
 * *unalias*: removes 1 alias for a core
 **The SolrCore name being an alias, this operation might trigger 
 a SolrCore rename
 *  *rename*: renames a core
 h3. CoreAdminHandler: handles CoreContainer operations
 * *load*/*create*:  CoreContainer load
 * *unload*:  CoreContainer unload
 * *reload*: CoreContainer reload
 * *swap*:  CoreContainer swap
 * *alias*:  CoreContainer alias
 * *unalias*: CoreContainer unalias
 *  *rename*: CoreContainer rename
 * *persist*: CoreContainer persist, writes the solr.xml
 * *status*: returns the status of all/one SolrCore
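
 A small sketch of the alias bookkeeping outlined in the CoreContainer section
 above. It is not taken from the attached patch: AliasRegistry and its method
 names are hypothetical, the type parameter C stands in for SolrCore, and the
 rename and JMX side effects mentioned in the description are left out.

```java
import java.util.HashMap;
import java.util.Map;

public class AliasRegistry<C> {
    // Each alias points to exactly one core; a Map key is unique by
    // construction, which gives the "alias uniqueness" guarantee.
    private final Map<String, C> aliases = new HashMap<>();

    /** Points the alias at the core; returns the core that used it before, or null. */
    public synchronized C alias(String alias, C core) {
        return aliases.put(alias, core);
    }

    /** Removes the alias; returns the core it pointed to, or null. */
    public synchronized C unalias(String alias) {
        return aliases.remove(alias);
    }

    /** Exchanges the cores behind two existing aliases. */
    public synchronized void swap(String first, String second) {
        C a = aliases.get(first);
        C b = aliases.get(second);
        if (a == null || b == null) {
            throw new IllegalArgumentException("unknown alias");
        }
        aliases.put(first, b);
        aliases.put(second, a);
    }

    /** Looks up the core behind an alias, or null if there is none. */
    public synchronized C get(String alias) {
        return aliases.get(alias);
    }
}
```

 Because alias returns the previously registered core, the caller can tell
 when adding an alias has taken one away from another core, which is the case
 where the description says a SolrCore rename might be triggered.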




[jira] Commented: (SOLR-731) CoreDescriptor.getCoreContainer should not be public

2008-08-29 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627007#action_12627007
 ] 

Otis Gospodnetic commented on SOLR-731:
---

I've just started looking at CoreDescriptor/CoreContainer.  The two (plus 
SolrCore) seem quite a bit intertwined, so I think removing the CD -> CC 
reference sounds like a simplification.

I don't understand the benefit of removing the SC -> CD reference though, or of 
the on-the-fly reconstruction of the core's CD in that describe method.


 CoreDescriptor.getCoreContainer should not be public
 

 Key: SOLR-731
 URL: https://issues.apache.org/jira/browse/SOLR-731
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Henri Biestro
 Attachments: solr-731.patch


 For the very same reasons that CoreDescriptor.getCoreProperties did not need 
 to be public (aka SOLR-724)
 It also means the CoreDescriptor ctor should not need a CoreContainer
 The CoreDescriptor is only meant to be describing a to-be created SolrCore.
 However, we need access to the CoreContainer from the SolrCore now that we 
 are guaranteed the CoreContainer always exists.
 This is also a natural consequence of SOLR-647 now that the CoreContainer is 
 not a map of CoreDescriptor but a map of SolrCore.




RE: prototype Solr 1.3 RC 1

2008-08-29 Thread Steven A Rowe
Random nit: in release-candidate/build/docs/who.pdf, Otis's name is spelled 
Otis Gospodneti# (final character is a hash mark). - Steve

On 08/29/2008 at 11:16 AM, Grant Ingersoll wrote:
 I created a Hudson task to do the building/archival tasks for the
 release candidates.  It is an on-demand task (i.e. not scheduled).
 
 See http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Candidate/
 for the job in general.
 
 The artifacts (including Maven) are at:
 http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Candidate/lastSuccessfulBuild/artifact/
 
 The web site (including javadocs):
 http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Candidate/lastSuccessfulBuild/artifact/release-candidate/build/docs/index.html
 
 I haven't gone through with a fine tooth comb yet, hence the
 "prototype" in the subject line, but from my preliminary skimming it
 seems to be on track.  I will cover it more later today.  In the
 meantime, feedback is appreciated.
 
 Cheers,
 Grant


 



Re: svn commit: r689978 - in /lucene/solr/trunk: CHANGES.txt src/java/org/apache/solr/search/SolrQueryParser.java src/java/org/apache/solr/search/WildcardFilter.java src/test/org/apache/solr/Converted

2008-08-29 Thread Chris Hostetter

Are we really sure we want to do this w/o making it configurable on 
the QParser? (ala: SOLR-218)

Unless I'm missing something this change breaks back compatibility for 
users who highlight wildcard queries.  As i recall: we even have users who 
force their prefix queries to be wildcards by using Hippo?* instead of 
Hippo* just so they can get highlighting.

+0


: Date: Thu, 28 Aug 2008 20:54:25 -
: From: [EMAIL PROTECTED]
: Reply-To: solr-dev@lucene.apache.org
: To: [EMAIL PROTECTED]
: Subject: svn commit: r689978 - in /lucene/solr/trunk: CHANGES.txt
: src/java/org/apache/solr/search/SolrQueryParser.java
: src/java/org/apache/solr/search/WildcardFilter.java
: src/test/org/apache/solr/ConvertedLegacyTest.java
: 
: Author: yonik
: Date: Thu Aug 28 13:54:24 2008
: New Revision: 689978
: 
: URL: http://svn.apache.org/viewvc?rev=689978&view=rev
: Log:
: SOLR-737: use a constant score query for wildcards
: 
: Added:
: lucene/solr/trunk/src/java/org/apache/solr/search/WildcardFilter.java   (with props)
: Modified:
: lucene/solr/trunk/CHANGES.txt
: lucene/solr/trunk/src/java/org/apache/solr/search/SolrQueryParser.java
: lucene/solr/trunk/src/test/org/apache/solr/ConvertedLegacyTest.java
: 
: Modified: lucene/solr/trunk/CHANGES.txt
: URL: http://svn.apache.org/viewvc/lucene/solr/trunk/CHANGES.txt?rev=689978&r1=689977&r2=689978&view=diff
: ==
: --- lucene/solr/trunk/CHANGES.txt (original)
: +++ lucene/solr/trunk/CHANGES.txt Thu Aug 28 13:54:24 2008
: @@ -396,6 +396,10 @@
:  
:   3. SOLR-647: reference count the SolrCore uses to prevent a premature
:  close while a core is still in use.  (Henri Biestro, Noble Paul, yonik)
: +
: + 4. SOLR-737: SolrQueryParser now uses a ConstantScoreQuery for wildcard
: +queries that prevent an exception from being thrown when the number
: +of matching terms exceeds the BooleanQuery clause limit.  (yonik)
:  
:  Optimizations
:   1. SOLR-276: improve JSON writer speed. (yonik)
: 
: Modified: lucene/solr/trunk/src/java/org/apache/solr/search/SolrQueryParser.java
: URL: http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/search/SolrQueryParser.java?rev=689978&r1=689977&r2=689978&view=diff
: ==
: --- lucene/solr/trunk/src/java/org/apache/solr/search/SolrQueryParser.java (original)
: +++ lucene/solr/trunk/src/java/org/apache/solr/search/SolrQueryParser.java Thu Aug 28 13:54:24 2008
: @@ -22,6 +22,8 @@
:  import org.apache.lucene.queryParser.QueryParser;
:  import org.apache.lucene.search.ConstantScoreRangeQuery;
:  import org.apache.lucene.search.Query;
: +import org.apache.lucene.search.WildcardQuery;
: +import org.apache.lucene.search.ConstantScoreQuery;
:  import org.apache.lucene.analysis.Analyzer;
:  import org.apache.solr.common.SolrException;
:  import org.apache.solr.schema.FieldType;
: @@ -144,4 +146,12 @@
:      return new ConstantScorePrefixQuery(t);
:    }
:  
: +  protected Query getWildcardQuery(String field, String termStr) throws ParseException {
: +    Query q = super.getWildcardQuery(field, termStr);
: +    if (q instanceof WildcardQuery) {
: +      // use a constant score query to avoid overflowing clauses
: +      return new ConstantScoreQuery(new WildcardFilter(((WildcardQuery)q).getTerm()));
: +    }
: +    return q;
: +  }
:  }
: 
: Added: lucene/solr/trunk/src/java/org/apache/solr/search/WildcardFilter.java
: URL: http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/search/WildcardFilter.java?rev=689978&view=auto
: ==
: --- lucene/solr/trunk/src/java/org/apache/solr/search/WildcardFilter.java (added)
: +++ lucene/solr/trunk/src/java/org/apache/solr/search/WildcardFilter.java Thu Aug 28 13:54:24 2008
: @@ -0,0 +1,103 @@
: +/**
: + * Licensed to the Apache Software Foundation (ASF) under one or more
: + * contributor license agreements.  See the NOTICE file distributed with
: + * this work for additional information regarding copyright ownership.
: + * The ASF licenses this file to You under the Apache License, Version 2.0
: + * (the "License"); you may not use this file except in compliance with
: + * the License.  You may obtain a copy of the License at
: + *
: + * http://www.apache.org/licenses/LICENSE-2.0
: + *
: + * Unless required by applicable law or agreed to in writing, software
: + * distributed under the License is distributed on an "AS IS" BASIS,
: + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
: + * See the License for the specific language governing permissions and
: + * limitations under the License.
: + */
: +
: +package org.apache.solr.search;
: +
: +import org.apache.lucene.search.Filter;
: +import org.apache.lucene.search.DocIdSet;
: +import org.apache.lucene.search.WildcardTermEnum;
: +import 

Re: svn commit: r689978 - in /lucene/solr/trunk: CHANGES.txt src/java/org/apache/solr/search/SolrQueryParser.java src/java/org/apache/solr/search/WildcardFilter.java src/test/org/apache/solr/Converted

2008-08-29 Thread Yonik Seeley
On Fri, Aug 29, 2008 at 12:18 PM, Chris Hostetter
[EMAIL PROTECTED] wrote:
 Are we really sure we want to do this w/o making it configurable on
 the QParser? (ala: SOLR-218)

I thought about that, but these expanding term queries that have no
bounds are really broken... people who depend on them are playing with
fire.  Imagine everything working fine for months, docs being
occasionally added, until *boom* a magic limit is hit.

I wouldn't be opposed to a config option I guess... but I think the
default should be something that doesn't unpredictably break (so it
still wouldn't be 100% back compatible).

IMO, Highlighting really needs to be fixed to go through a different
mechanism than extractTerms().
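
To make the "magic limit" concrete, here is a toy comparison of the two
strategies discussed in this thread. It does not use the Lucene API: the
index is a plain sorted map from term to document ids, only trailing-wildcard
(prefix) patterns are handled, and WildcardExpansion with both of its methods
is a hypothetical name. The first method mimics rewriting the pattern into one
clause per matching term, which fails once a clause limit is exceeded; the
second collects matching documents into a bit set, as a filter would.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WildcardExpansion {

    /**
     * One clause per matching term. Throws once more than maxClauses
     * terms match, so the failure depends on how the index has grown.
     */
    public static List<String> expandToClauses(TreeMap<String, int[]> index,
                                               String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        for (String term : index.tailMap(prefix, true).keySet()) {
            if (!term.startsWith(prefix)) {
                break; // sorted map: no later term can match
            }
            if (clauses.size() >= maxClauses) {
                throw new IllegalStateException("too many clauses");
            }
            clauses.add(term);
        }
        return clauses;
    }

    /**
     * Filter-style matching: every matching term's documents are OR-ed
     * into one bit set, so the number of matching terms is irrelevant.
     */
    public static BitSet matchAsFilter(TreeMap<String, int[]> index, String prefix) {
        BitSet docs = new BitSet();
        for (Map.Entry<String, int[]> e : index.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break;
            }
            for (int doc : e.getValue()) {
                docs.set(doc);
            }
        }
        return docs;
    }
}
```

The bit set keeps only document ids, not the terms that matched, which is
also why anything that relies on the expanded terms (highlighting, in this
thread) loses information with the filter approach.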

-Yonik


Re: svn commit: r689978 - in /lucene/solr/trunk: CHANGES.txt src/java/org/apache/solr/search/SolrQueryParser.java src/java/org/apache/solr/search/WildcardFilter.java src/test/org/apache/solr/Converted

2008-08-29 Thread Chris Hostetter

: fire.  Imagine everything working fine for months, docs being
: occasionally added, until *boom* a magic limit is hit.
: 
: I wouldn't be opposed to a config option I guess... but I think the
: default should be something that doesn't unpredictably break (so it
: still wouldn't be 100% back compatible).

I don't disagree with you (ok, i might disagree with you slightly, but 
only on the semantics of what "default" could mean).  I'm just saying: this 
change will break highlighting for people where it currently works, and 
there won't be anything they can do to make it work the way it used to.

Yes, it can be a timebomb for some people, but for others with small 
indexes and a manageable number of terms it works just fine.

-Hoss



[jira] Commented: (SOLR-731) CoreDescriptor.getCoreContainer should not be public

2008-08-29 Thread Henri Biestro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627035#action_12627035
 ] 

Henri Biestro commented on SOLR-731:


The issue is about having public access from the CD to the CC.
I think that it would be misleading to other contributors to believe the CD is 
more than a parameter passing facility; it should not be used beyond core 
creation.

Removing the SC->CD reference altogether is a strawman feature (at least, I 
know why this code should not make it to the trunk).
But it illustrates the point that this reference is not functionally needed 
since all information conveyed through the CD is exploited then stored elsewhere.

As you mentioned, CoreDescriptor/CoreContainer/SolrCore are intertwined so it 
is hard to cut those in pieces as I've been asked to do.
For all intents & purposes, SOLR-725 related issues are resolved in the SOLR-725 
attached patch.


 CoreDescriptor.getCoreContainer should not be public
 

 Key: SOLR-731
 URL: https://issues.apache.org/jira/browse/SOLR-731
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Henri Biestro
 Attachments: solr-731.patch


 For the very same reasons that CoreDescriptor.getCoreProperties did not need 
 to be public (aka SOLR-724)
 It also means the CoreDescriptor ctor should not need a CoreContainer
 The CoreDescriptor is only meant to be describing a to-be created SolrCore.
 However, we need access to the CoreContainer from the SolrCore now that we 
 are guaranteed the CoreContainer always exists.
 This is also a natural consequence of SOLR-647 now that the CoreContainer is 
 not a map of CoreDescriptor but a map of SolrCore.




[jira] Commented: (SOLR-739) Add support for OmitTf

2008-08-29 Thread Mike Klaas (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627049#action_12627049
 ] 

Mike Klaas commented on SOLR-739:
-

Haven't looked at the patch, but defaulting to omitTf=true is 
backwards-incompatible (think multi-valued string fields)

 Add support for OmitTf
 --

 Key: SOLR-739
 URL: https://issues.apache.org/jira/browse/SOLR-739
 Project: Solr
  Issue Type: New Feature
Reporter: Mark Miller
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-739.patch


 Allow setting omitTf in the field schema. Default to true for all but text 
 fields.




solr2: Onward and Upward

2008-08-29 Thread Yonik Seeley
I've been thinking about the next major version of Solr.
Here's some brainstorming on goals/ideas:
 - use a standard IOC container for externalization of configuration
and plugins... Spring springs to mind as the obvious choice here.
May want to use other spring services such as JMX integration, and
look into JMX management (more than just statistics), etc.
 - support programmatic construction and manipulation of IndexSchema, etc.
 - support some sort of standard RPC mechanism (Thrift, Etch, ???) so
strongly typed language bindings don't have to be developed for every
language (as it seems most people want).  Create an IDL for common
operations and then use a compiler to create the stubs for perl,
python, java, etc.
 - an RPC mechanism that can have multiple operations pending per
socket (and maybe use NIO) would probably be good for distributed
search, etc.
 - allow more lower level index operations... create a new index at a
given spot, merge multiple indices, etc.
 - make Solr more scalable and cloud computing friendly... make it
easier to create and deploy clusters/shards, as well as change the
size of clusters
 - remove the single-master points of failure per-shard (support
or incorporate something like bailey)
 - make it easier to deploy config changes (possibly use
zookeeper... prob want that for cluster management anyway)
 - since solr will have the data, possibly allow plugins that
could do map-reduce, or other interfaces that enable things like
mahout.
 - support more changes w/o manual re-indexing... change the schema
and have Solr re-index in the background (assuming all data is
available via stored fields or elsewhere via a plugin)
 - support more realtime search... greatly reducing or eliminating
the lag between adding a document and making it searchable
 - support tagging type of updates... quickly updating part of a
document, or data associated with a document
 - try to expose more lower-level Lucene functionality to better
support other projects that want to embed Solr (IOC should hopefully
make Solr easier to embed and customize too)

To support some of these goals, some re-architecture is probably in
the cards.  Caching based on the IndexReader rather than the
IndexSearcher is probably one necessary change.  We should also use
this as an opportunity to clean some things up and improve the core
architecture since this will be a major version change.  But we should
also
 - continue to support the current main solr web interfaces for
searching and update
 - retain (or improve) the ease of use factor
- we should always be able to point at an existing Lucene index
and do interesting things with it
- continue to focus on single-node ease of use for small web developers

As for the future of Solr 1.x, I fully expect a Solr 1.4 release as
well as other 1.x releases after that.

Possible next steps:
  - Have discussions on solr-dev with a subject prefix of solr2:
  - We should avoid the temptation to start banging out code (unless
it's just example code) and take some time to really leverage all of
the architectural experience this larger solr-dev community brings.
  - Establish a wiki section for solr2 to capture current consensus...
but generally use solr-dev for ideas and establishing that consensus
  - let java-dev know about this (i.e. what in Solr didn't suit their
needs and how can we change that)

Onward and upward... Other thoughts & ideas?

-Yonik


[jira] Commented: (SOLR-739) Add support for OmitTf

2008-08-29 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627057#action_12627057
 ] 

Yonik Seeley commented on SOLR-739:
---

I think it's unlikely to matter for non-text fields, but I guess we could 
always change the default to false and then update the example schema to set 
it to true everywhere except text fields.

 Add support for OmitTf
 --

 Key: SOLR-739
 URL: https://issues.apache.org/jira/browse/SOLR-739
 Project: Solr
  Issue Type: New Feature
Reporter: Mark Miller
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-739.patch


 Allow setting omitTf in the field schema. Default to true for all but text 
 fields.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (SOLR-684) Hudson builds do not have the SVN revision because svnversion is not available

2008-08-29 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man reassigned SOLR-684:
-

Assignee: Hoss Man

 Hudson builds do not have the SVN revision because svnversion is not available
 --

 Key: SOLR-684
 URL: https://issues.apache.org/jira/browse/SOLR-684
 Project: Solr
  Issue Type: Task
Affects Versions: 1.3
Reporter: Lars Kotthoff
Assignee: Hoss Man
Priority: Trivial

 The build file tries to run svnversion when generating the jar manifest to 
 include the revision number. This fails in Hudson however as the svnversion 
 executable is not available.
 This could be addressed by installing svnversion on the build machine or 
 using alternative means of determining the revision number.
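
 The second option in the description (an alternative means of determining
 the revision) could look roughly like the Python sketch below. It is not
 taken from the Solr build file: the function name svn_revision, its
 parameters, and the "unknown" fallback value are assumptions made for
 illustration. It runs svnversion only when the executable can be found and
 otherwise returns the fallback instead of failing.

```python
import shutil
import subprocess


def svn_revision(workdir=".", tool="svnversion", fallback="unknown"):
    """Return the working-copy revision, or `fallback` if it cannot be determined.

    All names here are hypothetical; this only illustrates a tolerant lookup.
    """
    exe = shutil.which(tool)  # None when the tool is not on PATH
    if exe is None:
        return fallback
    try:
        result = subprocess.run(
            [exe, workdir], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return fallback
    # An empty answer is treated the same as a missing tool.
    return result.stdout.strip() or fallback


if __name__ == "__main__":
    print(svn_revision())
```

 A manifest generated this way would carry a placeholder value on machines
 without the Subversion tools rather than breaking the build.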

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-684) Hudson builds do not have the SVN revision because svnversion is not available

2008-08-29 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627059#action_12627059
 ] 

Hoss Man commented on SOLR-684:
---

r690357 should help us solve this problem ... i've made the corresponding 
changes to the Hudson Solr-trunk config and manually kicked off a build.

if it works, i'll merge to the 1.3 branch.

 Hudson builds do not have the SVN revision because svnversion is not available
 --

 Key: SOLR-684
 URL: https://issues.apache.org/jira/browse/SOLR-684
 Project: Solr
  Issue Type: Task
Affects Versions: 1.3
Reporter: Lars Kotthoff
Priority: Trivial

 The build file tries to run svnversion when generating the jar manifest to 
 include the revision number. This fails in Hudson however as the svnversion 
 executable is not available.
 This could be addressed by installing svnversion on the build machine or 
 using alternative means of determining the revision number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: prototype Solr 1.3 RC 1

2008-08-29 Thread Grant Ingersoll


On Aug 29, 2008, at 12:05 PM, Steven A Rowe wrote:

Random nit: in release-candidate/build/docs/who.pdf, Otis's name is  
spelled Otis Gospodneti# (final character is a hash mark). - Steve


Hmm, that's weird.  It's also that way on the current site.




On 08/29/2008 at 11:16 AM, Grant Ingersoll wrote:

I created a Hudson task to do the building/archival tasks for the
release candidates.  It is an on-demand task (i.e. not scheduled).

See http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Candidate/
for the job in general.

The artifacts (including Maven) are at:
http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Candidate/lastSuccessfulBuild/artifact/

The web site (including javadocs):
http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Candidate/lastSuccessfulBuild/artifact/release-candidate/build/docs/index.html

I haven't gone through with a fine tooth comb yet, hence the
"prototype" in the subject line, but my preliminary skimming
suggests it is on track.  I will cover it more later today.  In the
meantime, feedback is appreciated.

Cheers,
Grant







--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ









[jira] Commented: (SOLR-684) Hudson builds do not have the SVN revision because svnversion is not available

2008-08-29 Thread Lars Kotthoff (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627067#action_12627067
 ] 

Lars Kotthoff commented on SOLR-684:


Hmmm, doesn't look like it :(

[exec] Execute failed: java.io.IOException: 
/opt/subversion-1.4.5/bin/svnversion: not found

 Hudson builds do not have the SVN revision because svnversion is not available
 --

 Key: SOLR-684
 URL: https://issues.apache.org/jira/browse/SOLR-684
 Project: Solr
  Issue Type: Task
Affects Versions: 1.3
Reporter: Lars Kotthoff
Assignee: Hoss Man
Priority: Trivial

 The build file tries to run svnversion when generating the jar manifest to 
 include the revision number. This fails in Hudson however as the svnversion 
 executable is not available.
 This could be addressed by installing svnversion on the build machine or 
 using alternative means of determining the revision number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: solr2: Onward and Upward

2008-08-29 Thread Grant Ingersoll


On Aug 29, 2008, at 2:03 PM, Yonik Seeley wrote:


I've been thinking about the next major version of Solr.
Here's some brainstorming on goals/ideas:
- use a standard IOC container for externalization of configuration
and plugins... Spring springs to mind as the obvious choice here.
May want to use other spring services such as JMX integration, and
look into JMX management (more than just statistics), etc.
- support programmatic construction and manipulation of IndexSchema,  
etc.


Definitely.  I'm a firm believer we spend too much time on  
configuration workarounds.  Also, the IOC layer definitely makes the  
second point here trivial.


It's maybe worthwhile to at least _consider_ being able to transform  
1.x configurations to 2.x, but not saying we have to.




- support some sort of standard RPC mechanism (Thrift, Etch, ???) so
strongly typed language bindings don't have to be developed for every
language (as it seems most people want).  Create an IDL for common
operations and then use a compiler to create the stubs for perl,
python, java, etc.
- an RPC mechanism that can have multiple operations pending per
socket (and maybe use NIO) would probably be good for distributed
search, etc.
- allow more lower level index operations... create a new index at a
given spot, merge multiple indices, etc.


+1



- make Solr more scalable and cloud computing friendly... make it
easier to create and deploy clusters/shards, as well as change the
size of clusters


People are definitely pushing on the scale front, it will be good to  
have it baked in from the ground up.




- remove the single-master points of failure per-shard (support
or incorporate something like bailey)
- make it easier to deploy config changes (possibly use
zookeeper... prob want that for cluster management anyway)
- since solr will have the data, possibly allow plugins that
could do map-reduce, or other interfaces that enable things like
mahout.


Ah, the marriage of Solr and Mahout.  Words cannot express my joy.   
Think automatic classification and named entity recognition over large  
scale distributed collections, all faceted and categorized and tied  
together in eternal bliss.  Sigh.  (Dang, that guy's weird!)


See also https://issues.apache.org/jira/browse/SOLR-651

I think it's also useful to think about how other NLP type tools plug  
in (i.e. sentence/paragraph detection, POS taggers, clustering,  
categorization, etc.)  Solr, thanks to its pluggable output and  
SearchComponent/Req Handler architecture can actually play quite well  
with these things.




- support more changes w/o manual re-indexing... change the schema
and have Solr re-index in the background (assuming all data is
available via stored fields or elsewhere via a plugin)


Cool



- support more realtime search... greatly reducing or eliminating
the lag between adding a document and making it searchable


One of the big things people often want



- support tagging type of updates... quickly updating part of a
document, or data associated with a document
- try to expose more lower-level Lucene functionality to better
support other projects that want to embed Solr (IOC should hopefully
make Solr easier to embed and customize too)


Yep.




To support some of these goals, some re-architecture is probably in
the cards.  Caching based on the IndexReader rather than the
IndexSearcher is probably one necessary change.  We should also use
this as an opportunity to clean some things up and improve the core
architecture since this will be a major version change.  But we should
also
- continue to support the current main solr web interfaces for
searching and update


Definitely a huge win.



- retain (or improve) the ease of use factor
   - we should always be able to point at an existing Lucene index
and do interesting things with it


Even w/o a schema?



   - continue to focus on single-node ease of use for small web  
developers



Yes, this is the majority of users, I would guess, i.e. sites in the  
range of less than 10 million docs.





As for the future of Solr 1.x, I fully expect a Solr 1.4 release as
well as other 1.x releases after that.

Possible next steps:
 - Have discussions on solr-dev with a subject prefix of solr2:
 - We should avoid the temptation to start banging out code (unless
it's just example code) and take some time to really leverage all of
the architectural experience this larger solr-dev community brings.
 - Establish a wiki section for solr2 to capture current consensus...
but generally use solr-dev for ideas and establishing that consensus
 - let java-dev know about this (i.e. what in Solr didn't suit their
needs and how can we change that)

Onward and upward... Other thoughts & ideas?


Better support for Spans, Payloads, Term Vectors.  Granted, Spans just  
need support via a query parser and the results written to the output,  
but Payloads are a bit trickier when it comes to the indexing side of  
things.


-Grant



Re: solr2: Onward and Upward

2008-08-29 Thread Erik Hatcher


On Aug 29, 2008, at 2:58 PM, Grant Ingersoll wrote:

Onward and upward... Other thoughts & ideas?


Better support for Spans, Payloads, Term Vectors.  Granted, Spans  
just need support via a query parser and the results written to the  
output, but Payloads are a bit trickier when it comes to the  
indexing side of things.


And let's not forget support for the tee token filter :)

Erik



[jira] Commented: (SOLR-540) Add support for hl.fl=*

2008-08-29 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627071#action_12627071
 ] 

David Smiley commented on SOLR-540:
---

I'd like this incorporated into Solr.  I stupidly didn't search for this 
feature and I went and did it myself... was just about to submit my own patch.  
Doh!

 Add support for hl.fl=*
 ---

 Key: SOLR-540
 URL: https://issues.apache.org/jira/browse/SOLR-540
 Project: Solr
  Issue Type: New Feature
  Components: highlighter
Affects Versions: 1.3
 Environment: Tomcat 5.5
Reporter: Lars Kotthoff
Priority: Minor
 Attachments: SOLR-540-highlight-all.patch, 
 SOLR-540-highlight-all.patch, SOLR-540-highlight-all.patch


 Adds support for the star value for the hl.fl parameter, i.e. highlighting 
 will be done on all fields (static and dynamic). Particularly useful in 
 conjunction with hl.requireFieldMatch=true, this way one can specify 
 generic highlighting parameters independent of the query/searched fields.
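
 As a rough illustration of the kind of request this issue describes, the
 Python sketch below builds a query URL that asks for highlighting on all
 fields. Only hl.fl and hl.requireFieldMatch come from the issue text; the
 host, port, handler path, and the example query string are placeholder
 assumptions, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def highlight_all_url(query, base="http://localhost:8983/solr/select"):
    """Build a select URL that highlights every field matched by the query."""
    params = {
        "q": query,
        "hl": "true",                    # turn highlighting on
        "hl.fl": "*",                    # the star value proposed in this issue
        "hl.requireFieldMatch": "true",  # only highlight fields the query matched
    }
    # Keep the star readable instead of percent-encoding it.
    return base + "?" + urlencode(params, safe="*")


if __name__ == "__main__":
    print(highlight_all_url("title:solr"))
```

 The point of the combination is that the same highlighting parameters can be
 reused unchanged no matter which fields a particular query searches.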

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



docs and src jars in tgz and zip distributions?

2008-08-29 Thread Chris Hostetter


checking out 1.3 RC1 i notice that the src Jars and docs Jars 
generated for maven (containing the source and javadocs for each 
corresponding code jar) are included in the dist directory of the 
release.


It makes sense to have these for maven users -- but they seem like 
overkill for normal tgz and zip file releases, which already contain the 
src and docs directly.


Was leaving these in dist to be included in the tgz/zip files just an 
oversight, or was it intentional that they be included in the conventional 
release artifacts?  Is there an advantage to them that i'm not aware of?


(FWIW: eliminating them would reduce the size of the tgz file about 13% 
... from 23M to 20M)



-Hoss



svnversion on lucene.zones.a.o ?

2008-08-29 Thread Chris Hostetter


can someone with a lucene zone account do me a favor and run "locate 
svnversion" on that box for me and reply with the output?


I thought i had a decent solution for SOLR-684, but i forgot that all the 
Lucene builds actually execute on the lucene zone (which apparently doesn't 
have the svn tools installed in the same place as hudson.zones.a.o), and 
apparently i have a hudson zone account but never bothered to get a lucene 
zone account.




-Hoss



Re: docs and src jars in tgz and zip distributions?

2008-08-29 Thread Yonik Seeley
Doesn't seem like maven artifacts should be in the download... that's
what the maven repo is for, right?

-Yonik

On Fri, Aug 29, 2008 at 3:07 PM, Chris Hostetter
[EMAIL PROTECTED] wrote:

 checking out 1.3 RC1 i notice that the src Jars and docs Jars generated
 for maven (containing the source and javadocs for each corresponding
 code jar) are included in the dist directory of the release.

 It makes sense to have these for maven users -- but they seem like overkill
 for normal tgz and zip file releases, which already contain the src and docs
 directly.

 Was leaving these in dist to be included in the tgz/zip files just an
 oversight, or was it intentional that they be included in the conventional
 release artifacts?  Is there an advantage to them that i'm not aware of?

 (FWIW: eliminating them would reduce the size of the tgz file about 13% ...
 from 23M to 20M)


 -Hoss




Re: prototype Solr 1.3 RC 1

2008-08-29 Thread Otis Gospodnetic
Grant, here is what it's supposed to be:  Gospodnetić

 
If Forrest and friends don't like that diacritic, I suppose I can live with 
Gospodnetic -- damn i18n! ;)

This is what I see locally:

$ ffxg Gospod 
./src/site/src/documentation/content/xdocs/who.xml:  <li>Otis Gospodneti&#263;</li>

$ find . -name \*html | xargs grep Gospod 
./site/who.html:<li>Otis Gospodnetić</li>

So the HTML looks OK, but apparently the PDF does not.  I don't know how else 
to specify that c with a diacritic other than with that &#263; .  $10 that 
Steve knows!

Thank you,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Grant Ingersoll [EMAIL PROTECTED]
 To: solr-dev@lucene.apache.org
 Sent: Friday, August 29, 2008 2:42:26 PM
 Subject: Re: prototype Solr 1.3 RC 1
 
 
 On Aug 29, 2008, at 12:05 PM, Steven A Rowe wrote:
 
  Random nit: in release-candidate/build/docs/who.pdf, Otis's name is  
  spelled Otis Gospodneti# (final character is a hash mark). - Steve
 
 Hmm, that's weird.  It's also that way on the current site.
 
 
 
  On 08/29/2008 at 11:16 AM, Grant Ingersoll wrote:
  I created a Hudson task to do the building/archival tasks for the
  release candidates.  It is an on-demand task (i.e. not scheduled)
 
  See http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Can
  didate/ for the job in general.
 
  The artifacts (including Maven) are at:
  http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Can
  didate/lastSuccessfulBuild/artifact/
 
  The web site (including javadocs):
  http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Can
  didate/lastSuccessfulBuild/artifact/release-candidate/build/do
  cs/index.html
 
  I haven't gone through with a fine tooth comb yet, hence the
  prototype in the subject line, but my preliminary skimming of it
  seems like it is on track.   I will cover it more later
  today.  In the
  meantime, feedback is appreciated.
 
  Cheers,
  Grant
 
 
 
 
 
 --
 Grant Ingersoll
 http://www.lucidimagination.com
 
 Lucene Helpful Hints:
 http://wiki.apache.org/lucene-java/BasicsOfPerformance
 http://wiki.apache.org/lucene-java/LuceneFAQ



Re: svnversion on lucene.zones.a.o ?

2008-08-29 Thread Yonik Seeley
I put it in my path just recently (and that old nightly build runs under my id).

-bash-3.00$ type svnversion
svnversion is /export/home/yonik/bin/svnversion
-bash-3.00$ ls -l /export/home/yonik/bin/svnversion
lrwxrwxrwx   1 yonik    other         38 Aug 27 19:18 /export/home/yonik/bin/svnversion -> /opt/subversion-current/bin/svnversion
-bash-3.00$

-Yonik

On Fri, Aug 29, 2008 at 3:10 PM, Chris Hostetter
[EMAIL PROTECTED] wrote:

 can someone with a lucene zone account do me a favor and run locate
 svnversion on that box for me and reply with the output?

 I thought i had a decent solution for SOLR-684, but i forgot that all the
 Lucene builds actually execute on the lucene zone (which apparently doesn't
 have the svn tools installed in the same place as hudson.zones.a.o), and
 apparently i have a hudson zone account but never bothered to get a lucene
 zone account.



 -Hoss


Re: prototype Solr 1.3 RC 1

2008-08-29 Thread Otis Gospodnetic
Maybe this is the answer:
  http://forrest.apache.org/docs_0_90/faq.html#encoding

And this is what we've got:

$ head -1 src/site/src/documentation/content/xdocs/who.xml 
<?xml version="1.0"?>

Sounds like something that would be good to fix in general, but I don't have 
forrest set up to try it :(


Otis




- Original Message 
 From: Otis Gospodnetic [EMAIL PROTECTED]
 To: solr-dev@lucene.apache.org
 Sent: Friday, August 29, 2008 3:13:39 PM
 Subject: Re: prototype Solr 1.3 RC 1
 
 Grant, here is what it's supposed to be:  Gospodnetić
 
 
 If Forrest and friends don't like that diacritic, I suppose I can live with 
 Gospodnetic -- damn i18n! ;)
 
 This is what I see locally:
 
 $ ffxg Gospod 
 ./src/site/src/documentation/content/xdocs/who.xml:  
* Otis Gospodnetić
 
 $ find . -name \*html | xargs grep Gospod 
 ./site/who.html:
* Otis Gospodnetić
 
 So the HTML looks OK, but apparently the PDF does not.  I don't know how else 
 to 
 specify that c with a diacritic other than with that ć .  $10 that Steve 
 knows!
 
 Thank you,
 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
 - Original Message 
  From: Grant Ingersoll 
  To: solr-dev@lucene.apache.org
  Sent: Friday, August 29, 2008 2:42:26 PM
  Subject: Re: prototype Solr 1.3 RC 1
  
  
  On Aug 29, 2008, at 12:05 PM, Steven A Rowe wrote:
  
   Random nit: in release-candidate/build/docs/who.pdf, Otis's name is  
   spelled Otis Gospodneti# (final character is a hash mark). - Steve
  
  Hmm, that's weird.  It's also that way on the current site.
  
  
  
   On 08/29/2008 at 11:16 AM, Grant Ingersoll wrote:
   I created a Hudson task to do the building/archival tasks for the
   release candidates.  It is an on-demand task (i.e. not scheduled)
  
   See http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Can
   didate/ for the job in general.
  
   The artifacts (including Maven) are at:
   http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Can
   didate/lastSuccessfulBuild/artifact/
  
   The web site (including javadocs):
   http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Can
   didate/lastSuccessfulBuild/artifact/release-candidate/build/do
   cs/index.html
  
   I haven't gone through with a fine tooth comb yet, hence the
   prototype in the subject line, but my preliminary skimming of it
   seems like it is on track.   I will cover it more later
   today.  In the
   meantime, feedback is appreciated.
  
   Cheers,
   Grant
  
  
  
  
  
  --
  Grant Ingersoll
  http://www.lucidimagination.com
  
  Lucene Helpful Hints:
  http://wiki.apache.org/lucene-java/BasicsOfPerformance
  http://wiki.apache.org/lucene-java/LuceneFAQ



Re: prototype Solr 1.3 RC 1

2008-08-29 Thread Chris Hostetter

: If Forrest and friends don't like that diacritic, I suppose I can live 
: with Gospodnetic -- damn i18n! ;)

I seem to recall that we had this problem with forrest and the Lucene-Java 
who page as well ... over there you are listed in lowly ASCII, without 
your I18N goodness.

--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Grant Ingersoll [EMAIL PROTECTED]
 To: solr-dev@lucene.apache.org
 Sent: Friday, August 29, 2008 2:42:26 PM
 Subject: Re: prototype Solr 1.3 RC 1
 
 
 On Aug 29, 2008, at 12:05 PM, Steven A Rowe wrote:
 
  Random nit: in release-candidate/build/docs/who.pdf, Otis's name is  
  spelled Otis Gospodneti# (final character is a hash mark). - Steve
 
 Hmm, that's weird.  It's also that way on the current site.
 
 
 
  On 08/29/2008 at 11:16 AM, Grant Ingersoll wrote:
  I created a Hudson task to do the building/archival tasks for the
  release candidates.  It is an on-demand task (i.e. not scheduled)
 
  See http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Can
  didate/ for the job in general.
 
  The artifacts (including Maven) are at:
  http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Can
  didate/lastSuccessfulBuild/artifact/
 
  The web site (including javadocs):
  http://hudson.zones.apache.org/hudson/job/Solr%20Release%20Can
  didate/lastSuccessfulBuild/artifact/release-candidate/build/do
  cs/index.html
 
  I haven't gone through with a fine tooth comb yet, hence the
  prototype in the subject line, but my preliminary skimming of it
  seems like it is on track.   I will cover it more later
  today.  In the
  meantime, feedback is appreciated.
 
  Cheers,
  Grant
 
 
 
 
 
 --
 Grant Ingersoll
 http://www.lucidimagination.com
 
 Lucene Helpful Hints:
 http://wiki.apache.org/lucene-java/BasicsOfPerformance
 http://wiki.apache.org/lucene-java/LuceneFAQ

: 



-Hoss



[jira] Commented: (SOLR-684) Hudson builds do not have the SVN revision because svnversion is not available

2008-08-29 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627077#action_12627077
 ] 

Hoss Man commented on SOLR-684:
---

yeah ... i forgot the solr hudson builds are actually run on the Lucene zone 
machine (which has a slightly different setup than the hudson zone) ... let's 
see how build#551 goes

 Hudson builds do not have the SVN revision because svnversion is not available
 --

 Key: SOLR-684
 URL: https://issues.apache.org/jira/browse/SOLR-684
 Project: Solr
  Issue Type: Task
Affects Versions: 1.3
Reporter: Lars Kotthoff
Assignee: Hoss Man
Priority: Trivial

 The build file tries to run svnversion when generating the jar manifest to 
 include the revision number. This fails in Hudson however as the svnversion 
 executable is not available.
 This could be addressed by installing svnversion on the build machine or 
 using alternative means of determining the revision number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: prototype Solr 1.3 RC 1

2008-08-29 Thread Chris Hostetter

: Maybe this is the answer:
:  http://forrest.apache.org/docs_0_90/faq.html#encoding

my reading of that is that setting an encoding will let you use the 
literal UTF-8 character (which is what i would expect) but note the end of 
the answer...

 Another option is to use character entities such as &ouml; (ö) or 
 the numeric form &#246; (ö).

...which is what we are doing.  I suspect the PDF formatter just doesn't 
play nicely with the non-trivial UTF-8 characters.
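
To make the equivalence concrete, here is a small Python check, using only 
the standard library, that the entity forms quoted from the Forrest FAQ and 
the numeric reference used in who.xml decode to the same characters as the 
literals. The helper name is made up for illustration, and this says nothing 
about how the PDF output handles them; it only shows that the references 
themselves are well-formed.

```python
import html


def decode(text):
    """Resolve HTML/XML character references to literal characters."""
    return html.unescape(text)


name = decode("Otis Gospodneti&#263;")

if __name__ == "__main__":
    print(name)                                  # name with the literal character
    print(ord(name[-1]))                         # code point of the final character
    print(decode("&ouml;") == decode("&#246;"))  # named and numeric forms agree
    print(name[-1].encode("utf-8"))              # its UTF-8 byte sequence
```

If the source file is correct in this sense, the remaining suspect is the 
font or encoding support in the PDF step, which matches the guess above.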


-Hoss


Re: prototype Solr 1.3 RC 1

2008-08-29 Thread Grant Ingersoll


On Aug 29, 2008, at 3:18 PM, Otis Gospodnetic wrote:


Maybe this is the answer:
 http://forrest.apache.org/docs_0_90/faq.html#encoding

And this is what we've got:

$ head -1 src/site/src/documentation/content/xdocs/who.xml
<?xml version="1.0"?>

Sounds like something that would be good to fix in general, but I  
don't have forrest set up to try it :(


It's easy to set up ;-)  http://forrest.apache.org
 


Re: docs and src jars in tgz and zip distributions?

2008-08-29 Thread Grant Ingersoll

Yeah, no need for them to be in there.  I'll take care of it.


On Aug 29, 2008, at 3:07 PM, Chris Hostetter wrote:



checking out 1.3 RC1 i notice that the src Jars and docs Jars  
generated for maven (containing the source and javadocs for each  
corresponding code jar) are included in the dist directory of the  
release.


It makes sense to have these for maven users -- but they seem like  
overkill for normal tgz and zip file releases, which already contain  
the src and docs directly.


Was leaving these in dist to be included in the tgz/zip files just  
an oversight, or was it intentional that they be included in the  
conventional release artifacts?  Is there an advantage to them that  
i'm not aware of?


(FWIW: eliminating them would reduce the size of the tgz file about  
13% ... from 23M to 20M)



-Hoss






[jira] Commented: (SOLR-684) Hudson builds do not have the SVN revision because svnversion is not available

2008-08-29 Thread Lars Kotthoff (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627080#action_12627080
 ] 

Lars Kotthoff commented on SOLR-684:


Looks good :)

 Hudson builds do not have the SVN revision because svnversion is not available
 --

 Key: SOLR-684
 URL: https://issues.apache.org/jira/browse/SOLR-684
 Project: Solr
  Issue Type: Task
Affects Versions: 1.3
Reporter: Lars Kotthoff
Assignee: Hoss Man
Priority: Trivial

 The build file tries to run svnversion when generating the jar manifest to 
 include the revision number. This fails in Hudson however as the svnversion 
 executable is not available.
 This could be addressed by installing svnversion on the build machine or 
 using alternative means of determining the revision number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-334) pluggable query parsers

2008-08-29 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627081#action_12627081
 ] 

David Smiley commented on SOLR-334:
---

I intend to submit a patch very soon which I think is related to this.  
There are two parts to it:
1. Enhancing the DisjunctionMaxQueryParser to work on all the query variants 
such as wildcard, prefix, and fuzzy queries.  This was not in Solr already 
because DisMax was only used for a very limited syntax that didn't use those 
features.  In my opinion, this makes a more suitable base parser for general 
use because, unlike the Lucene/Solr parser, this one supports multiple default 
fields, whereas other ones can't do dismax (your !prefix one, for example).  
The notion of a single default field is antiquated and a technical 
under-the-hood detail of Lucene that I think Solr should shield the user from 
by using a DisMax on the fly when multiple fields are used.

2. Enhancing the DisMax QParser plugin to use a pluggable query string 
re-writer (via subclass extension) instead of the logic currently embedded 
within it (i.e. the "escape nearly everything" logic).  Additionally, I made 
this QParser have a notion of a simple syntax (the default) or non-simple, in 
which case some of the logic in this QParser doesn't occur because it's 
irrelevant (phrase boosting and min-should-match in particular).  As part of my 
work I significantly moved the code around to make it clearer and more extensible.
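
For readers unfamiliar with the term: a disjunction-max query scores a 
document by its best-matching field rather than by summing every field, 
optionally adding a fraction of the other fields' scores as a tie-breaker. 
A toy Python sketch of that combination rule follows. It is not code from 
the patch, and the function and parameter names are made up for illustration.

```python
def dismax_score(field_scores, tie_breaker=0.0):
    """Combine per-field scores the disjunction-max way.

    The best field dominates; the remaining matching fields contribute
    only tie_breaker times their summed score.
    """
    scores = [s for s in field_scores if s > 0.0]  # ignore non-matching fields
    if not scores:
        return 0.0
    best = max(scores)
    return best + tie_breaker * (sum(scores) - best)


if __name__ == "__main__":
    # Same term scored against three hypothetical fields.
    print(dismax_score([0.5, 0.2, 0.1]))
    print(dismax_score([0.5, 0.2, 0.1], tie_breaker=0.1))
```

With a tie-breaker of zero only the best field counts, which is why one 
query string can be spread over several "default" fields without a document 
being rewarded merely for repeating the term in every field.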

Should I submit a new issue for this or add to this one?

 pluggable query parsers
 ---

 Key: SOLR-334
 URL: https://issues.apache.org/jira/browse/SOLR-334
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
 Attachments: angle2curly.patch, qparser.patch, qparser.patch, 
 qparser.patch


 One should be able to optionally specify an alternate query syntax on a 
 per-query basis
 http://www.nabble.com/Using-HTTP-Post-for-Queries-tf3039973.html#a8483387
 Many benefits, including avoiding the need to do query parser escaping for 
 simple term or prefix queries.
 Possible Examples:
 fq=<!term field=myfield>The Term
 fq=<!prefix field=myfield>The Prefix
 q=<!qp op=AND>a b
 q=<!xml><?xml...  // lucene XML query syntax?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-284) Parsing Rich Document Types

2008-08-29 Thread Chris Harris (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Harris updated SOLR-284:
--

Attachment: un-hardcode-id.diff

The patch, as it currently stands, treats a field called id as a special case. 
First, it is a required field. Second, unlike any other field, you don't need 
to declare it in the fieldnames parameter. Finally, since the value is read 
with fieldSolrParams.getInt(), that field is required to be an int.

This special-case treatment seems a little too particular to me; not everyone 
wants to have a field called id, and not everyone who does wants that field 
to be an int. So what I propose is to eliminate the special treatment of id. 
See un-hardcode-id.diff for what this might mean in particular. (That file is 
not complete; to correctly make this change, I'd have to update the test cases.)

This is a breaking change, because if you *are* using an id field, you'll now 
have to specifically indicate that fact in the fieldnames parameter. Thus, 
instead of

http://localhost:8983/solr/update/rich?stream.file=myfile.doc&stream.type=doc&id=100&stream.fieldname=text&fieldnames=subject,author&subject=mysubject&author=eric

you'll have to put

http://localhost:8983/solr/update/rich?stream.file=myfile.doc&stream.type=doc&id=100&stream.fieldname=text&fieldnames=id,subject,author&subject=mysubject&author=eric

I think asking users of this patch to make this slight change in their client 
code is not an unreasonable burden, but I'm curious what Eric and others have 
to say.
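
A small Python sketch of what the client-side change amounts to. The 
parameter names and values are the ones from the two URLs in this comment; 
the helper name rich_update_url is hypothetical, and nothing here is part of 
the patch itself.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/update/rich"


def rich_update_url(stream_file, fields, field_names):
    """Build an update/rich URL; field_names becomes the fieldnames parameter."""
    params = {
        "stream.file": stream_file,
        "stream.type": "doc",
        "stream.fieldname": "text",
        "fieldnames": ",".join(field_names),
    }
    params.update(fields)  # the literal field values, e.g. id, subject, author
    # Keep commas readable in the fieldnames list.
    return BASE + "?" + urlencode(params, safe=",")


if __name__ == "__main__":
    fields = {"id": "100", "subject": "mysubject", "author": "eric"}
    # With the proposed change, "id" must be listed explicitly in fieldnames.
    print(rich_update_url("myfile.doc", fields, ["id", "subject", "author"]))
```

The only difference for existing clients is the extra "id" entry in the 
list passed as field_names.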

 Parsing Rich Document Types
 ---

 Key: SOLR-284
 URL: https://issues.apache.org/jira/browse/SOLR-284
 Project: Solr
  Issue Type: New Feature
  Components: update
Reporter: Eric Pugh
 Fix For: 1.4

 Attachments: libs.zip, rich.patch, rich.patch, rich.patch, 
 rich.patch, rich.patch, source.zip, test-files.zip, test-files.zip, test.zip, 
 un-hardcode-id.diff


 I have developed a RichDocumentRequestHandler based on the CSVRequestHandler 
 that supports streaming a PDF, Word, PowerPoint, or Excel document into 
 Solr.
 There is a wiki page with information here: 
 http://wiki.apache.org/solr/UpdateRichDocuments
  




[jira] Commented: (SOLR-334) pluggable query parsers

2008-08-29 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627084#action_12627084
 ] 

Yonik Seeley commented on SOLR-334:
---

bq. Should I submit a new issue for this or add to this one?

Should definitely get its own issue.

 pluggable query parsers
 ---

 Key: SOLR-334
 URL: https://issues.apache.org/jira/browse/SOLR-334
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
 Fix For: 1.3

 Attachments: angle2curly.patch, qparser.patch, qparser.patch, 
 qparser.patch


 One should be able to optionally specify an alternate query syntax on a 
 per-query basis
 http://www.nabble.com/Using-HTTP-Post-for-Queries-tf3039973.html#a8483387
 Many benefits, including avoiding the need to do query parser escaping for 
 simple term or prefix queries.
 Possible Examples:
 fq=<!term field=myfield>The Term
 fq=<!prefix field=myfield>The Prefix
 q=<!qp op=AND>a b
 q=<!xml><?xml...  // lucene XML query syntax?




[jira] Resolved: (SOLR-334) pluggable query parsers

2008-08-29 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-334.
---

   Resolution: Fixed
Fix Version/s: 1.3

resolving this issue.

 pluggable query parsers
 ---

 Key: SOLR-334
 URL: https://issues.apache.org/jira/browse/SOLR-334
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
 Fix For: 1.3

 Attachments: angle2curly.patch, qparser.patch, qparser.patch, 
 qparser.patch


 One should be able to optionally specify an alternate query syntax on a 
 per-query basis
 http://www.nabble.com/Using-HTTP-Post-for-Queries-tf3039973.html#a8483387
 Many benefits, including avoiding the need to do query parser escaping for 
 simple term or prefix queries.
 Possible Examples:
 fq=<!term field=myfield>The Term
 fq=<!prefix field=myfield>The Prefix
 q=<!qp op=AND>a b
 q=<!xml><?xml...  // lucene XML query syntax?




[jira] Created: (SOLR-740) legacy gettableFiles support not working in 1.3.0-RC1

2008-08-29 Thread Hoss Man (JIRA)
legacy gettableFiles support not working in 1.3.0-RC1
-

 Key: SOLR-740
 URL: https://issues.apache.org/jira/browse/SOLR-740
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Hoss Man
 Fix For: 1.3


i haven't had a chance to drill into this yet to figure out what's going wrong 
(i know we already had an issue about this that was fixed), but testing the 
example configs from 1.2 against 1.3rc1 i'm seeing the following behavior...

# the following message is logged...{code}
Aug 29, 2008 12:41:31 PM org.apache.solr.core.SolrCore initDeprecatedSupport
WARNING: adding ShowFileRequestHandler with hidden files: [.SVN, XSLT, 
SYNONYMS.TXT, PROTWORDS.TXT, STOPWORDS.TXT, SCRIPTS.CONF]
{code} (Note: that is not the list of files configured in the 1.2 example 
gettableFiles, it's the list of all files in the solr/conf dir ... and for 
some reason they are all uppercased)
# links on the admin screen for the schema and config files point to...{code}
file/?file=schema.xml   ...ie...  
http://localhost:8983/solr/admin/file/?file=schema.xml
file/?file=solrconfig.xml  ...ie... 
http://localhost:8983/solr/admin/file/?file=solrconfig.xml
{code}
# those links don't work (404, nothing seems to be logged by Solr)
# the legacy form of the urls using get-file.jsp (which people might have 
bookmarked) *do* in fact work...{code}
http://localhost:8983/solr/admin/get-file.jsp?file=solrconfig.xml
http://localhost:8983/solr/admin/get-file.jsp?file=schema.xml
{code}...but based on the whitespace at the top of the files, i suspect that is 
really the JSP getting used, not the ShowFileRequestHandler

To reproduce:
# check out the solr 1.2 tag.
# copy the 1.3-rc1 war on top of the 1.2 example/webapps/solr.war
# run the 1.2 example code as normal (java -jar start.jar)






[jira] Updated: (SOLR-684) Hudson builds do not have the SVN revision because svnversion is not available

2008-08-29 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-684:
--

Fix Version/s: 1.3

I've merged to 1.3 in r690384 and made the appropriate Hudson config changes 
for Grant's shiny new Hudson Solr Release Candidate setup, so 1.3-RC2 and on 
should be golden.

 Hudson builds do not have the SVN revision because svnversion is not available
 --

 Key: SOLR-684
 URL: https://issues.apache.org/jira/browse/SOLR-684
 Project: Solr
  Issue Type: Task
Affects Versions: 1.3
Reporter: Lars Kotthoff
Assignee: Hoss Man
Priority: Trivial
 Fix For: 1.3


 The build file tries to run svnversion when generating the jar manifest to 
 include the revision number. This fails in Hudson however as the svnversion 
 executable is not available.
 This could be addressed by installing svnversion on the build machine or 
 using alternative means of determining the revision number.




[jira] Resolved: (SOLR-684) Hudson builds do not have the SVN revision because svnversion is not available

2008-08-29 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-684.
---

Resolution: Fixed

 Hudson builds do not have the SVN revision because svnversion is not available
 --

 Key: SOLR-684
 URL: https://issues.apache.org/jira/browse/SOLR-684
 Project: Solr
  Issue Type: Task
Affects Versions: 1.3
Reporter: Lars Kotthoff
Assignee: Hoss Man
Priority: Trivial
 Fix For: 1.3


 The build file tries to run svnversion when generating the jar manifest to 
 include the revision number. This fails in Hudson however as the svnversion 
 executable is not available.
 This could be addressed by installing svnversion on the build machine or 
 using alternative means of determining the revision number.




Re: solr2: Onward and Upward

2008-08-29 Thread Shalin Shekhar Mangar
On Fri, Aug 29, 2008 at 11:33 PM, Yonik Seeley [EMAIL PROTECTED] wrote:

 I've been thinking about the next major version of Solr.
 Here's some brainstorming on goals/ideas:
  - use a standard IOC container for externalization of configuration
 and plugins... Spring springs to mind as the obvious choice here.
 May want to use other spring services such as JMX integration, and
 look into JMX management (more than just statistics), etc.
  - support programmatic construction and manipulation of IndexSchema, etc.
  - support some sort of standard RPC mechanism (Thrift, Etch, ???) so
 strongly typed language bindings don't have to be developed for every
 language (as it seems most people want).  Create an IDL for common
 operations and then use a compiler to create the stubs for perl,
 python, java, etc.
 - an RPC mechanism that can have multiple operations pending per
 socket (and maybe use NIO) would probably be good for distributed
 search, etc.
  - allow more lower level index operations... create a new index at a
 given spot, merge multiple indices, etc.
  - make Solr more scalable and cloud computing friendly... make it
 easier to create and deploy clusters/shards, as well as change the
 size of clusters
 - remove the single-master points of failure per-shard (support
 or incorporate something like bailey)
 - make it easier to deploy config changes (possibly use
 zookeeper... prob want that for cluster management anyway)
 - since solr will have the data, possibly allow plugins that
 could do map-reduce, or other interfaces that enable things like
 mahout.
  - support more changes w/o manual re-indexing... change the schema
 and have Solr re-index in the background (assuming all data is
 available via stored fields or elsewhere via a plugin)
  - support more realtime search... greatly reducing or eliminating
 the lag between adding a document and making it searchable
  - support tagging type of updates... quickly updating part of a
 document, or data associated with a document
  - try to expose more lower-level Lucene functionality to better
 support other projects that want to embed Solr (IOC should hopefully
 make Solr easier to embed and customize too)

 To support some of these goals, some re-architecture is probably in
 the cards.  Caching based on the IndexReader rather than the
 IndexSearcher is probably one necessary change.  We should also use
 this as an opportunity to clean some things up and improve the core
 architecture since this will be a major version change.  But we should
 also
  - continue to support the current main solr web interfaces for
 searching and update
  - retain (or improve) the ease of use factor
- we should always be able to point at an existing Lucene index
 and do interesting things with it
- continue to focus on single-node ease of use for small web developers

 As for the future of Solr 1.x, I fully expect a Solr 1.4 release as
 well as other 1.x releases after that.

 Possible next steps:
  - Have discussions on solr-dev with a subject prefix of solr2:
  - We should avoid the temptation to start banging out code (unless
 it's just example code) and take some time to really leverage all of
 the architectural experience this larger solr-dev community brings.
  - Establish a wiki section for solr2 to capture current consensus...
 but generally use solr-dev for ideas and establishing that consensus
  - let java-dev know about this (i.e. what in Solr didn't suit their
 needs and how can we change that)

 Onward and upward... Other thoughts & ideas?



You're a mind reader ;-)

Noble and I have been discussing many of the same things. Prominent topics
have included real time search, eliminating dependency on master (a torrent
like replication, well not exactly, but close to that), map-reduce support,
exposing operations through JMX (not just read-only statistics), integrating
the work done on Mahout, a cross-language binary format (using Thrift, see
THRIFT-110, THRIFT-122).

Another area was that Solr should learn from its mistakes :-)
Basically, this is related to providing ways for applications to give
feedback to Solr -- querylog/clickstream analysis or direct feedback for
better search, more like this and spelling suggestions.

-- 
Regards,
Shalin Shekhar Mangar.


[jira] Commented: (SOLR-684) Hudson builds do not have the SVN revision because svnversion is not available

2008-08-29 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627108#action_12627108
 ] 

Hoss Man commented on SOLR-684:
---

bq. When you get something like 690356:690357M it's highly unlikely that this 
will help you to uniquely identify the revision as a whole or recreate it, 
which I think is the point of including it in the manifest.

True: but it will tell you that you *can't* recreate it purely from the 
repository.  having the svnversion info is just an added bit of security 
blanket for people who later want a starting point to understand where the 
release came from -- when it's a single number, you have a good starting point. 
 when it's mixed you have an ambiguous starting point, and when it's got an M 
all bets are off.
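
As a rough illustration of the three cases described above (single number, 
mixed range, local modifications), here is a small Python sketch that 
classifies an svnversion string. The function name and the category labels are 
made up for this example; it is not something the build file does. The trailing 
S and P flags are accepted but not distinguished here.

```python
import re

# Matches e.g. "690384", "690356:690357", "690356:690357M".
# Trailing flags: M = modified; S and P are accepted but ignored in this sketch.
_SVNVERSION_RE = re.compile(r"^(\d+)(?::(\d+))?([MSP]*)$")


def classify_svnversion(output):
    """Return 'single', 'mixed', 'modified', or 'unknown' (hypothetical labels)."""
    match = _SVNVERSION_RE.match(output.strip())
    if match is None:
        # Anything that is not a revision string, e.g. an unversioned export.
        return "unknown"
    _low, high, flags = match.groups()
    if "M" in flags:
        # Local modifications: cannot be recreated purely from the repository.
        return "modified"
    if high is not None:
        # Mixed-revision working copy: an ambiguous starting point.
        return "mixed"
    # A single revision number: a good starting point.
    return "single"
```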

 Hudson builds do not have the SVN revision because svnversion is not available
 --

 Key: SOLR-684
 URL: https://issues.apache.org/jira/browse/SOLR-684
 Project: Solr
  Issue Type: Task
Affects Versions: 1.3
Reporter: Lars Kotthoff
Assignee: Hoss Man
Priority: Trivial
 Fix For: 1.3


 The build file tries to run svnversion when generating the jar manifest to 
 include the revision number. This fails in Hudson however as the svnversion 
 executable is not available.
 This could be addressed by installing svnversion on the build machine or 
 using alternative means of determining the revision number.




Re: prototype Solr 1.3 RC 1

2008-08-29 Thread Chris Hostetter

: I haven't gone through with a fine tooth comb yet, hence the prototype in
: the subject line, but my preliminary skimming of it seems like it is on track.
: I will cover it more later today.  In the meantime, feedback is appreciated.

I've done some testing with both the Solr 1.2 example and the configs from 
my apachecon demo last year, the only problem that jumped out at me was 
SOLR-740.

FWIW: I've also committed some usage/javadoc fixes.


-Hoss



[jira] Created: (SOLR-741) Add support for rounding dates in DateField

2008-08-29 Thread Shalin Shekhar Mangar (JIRA)
Add support for rounding dates in DateField
---

 Key: SOLR-741
 URL: https://issues.apache.org/jira/browse/SOLR-741
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.4


As discussed at http://www.nabble.com/Rounding-date-fields-td19203108.html

Since rounding dates to a coarse value is an often recommended solution to 
decrease the number of unique terms, we should add support for doing this in 
DateField itself. A number of syntaxes were proposed; some of them were:
# <fieldType name="date" class="solr.DateField" sortMissingLast="true" 
omitNorms="true" roundTo="-1MINUTE" /> (Shalin)
# <fieldType name="date" class="solr.DateField" sortMissingLast="true" 
omitNorms="true" round="DOWN_MINUTE" /> (Otis)

Hoss proposed more general enhancements related to arbitrary pre-processing of 
values prior to indexing/storing using pre-processing analyzers.

This issue aims to build a consensus on the solution to pursue and to implement 
that solution inside Solr.
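
To illustrate why rounding helps, here is a small Python sketch (not Solr 
code) that rounds a handful of timestamps down to the minute and compares the 
number of distinct values before and after. The sample timestamps and the 
helper name are invented for the example.

```python
from datetime import datetime


def round_down_to_minute(dt):
    """Drop seconds and sub-second precision (hypothetical helper)."""
    return dt.replace(second=0, microsecond=0)


# Invented sample data: four values within one minute, one in the next.
timestamps = [datetime(2008, 8, 29, 12, 41, s) for s in (1, 15, 31, 59)]
timestamps.append(datetime(2008, 8, 29, 12, 42, 5))

unique_raw = len(set(timestamps))
unique_rounded = len({round_down_to_minute(t) for t in timestamps})
```

Five distinct raw values collapse to two after rounding, which is the 
reduction in unique terms the issue is after.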




[jira] Created: (SOLR-742) Unable to create dynamic fields with custom DataImportHandler transformer

2008-08-29 Thread Wojtek Piaseczny (JIRA)
Unable to create dynamic fields with custom DataImportHandler transformer
-

 Key: SOLR-742
 URL: https://issues.apache.org/jira/browse/SOLR-742
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 1.3
Reporter: Wojtek Piaseczny
 Fix For: 1.3


Discussion at: 
http://www.nabble.com/Creating-dynamic-fields-with-DataImportHandler-to19226532.html

Dynamic fields aren't created when specified in a DataImportHandler's 
transformer. 

Reproducing the issue:
I have defined a dynamic field (of type sdouble) in my schema called 
_dynamic*. Inside the transformer's transformRow method, I am adding the 
name-value pair _dynamicTest and '1.0'. No errors are observed, but the data 
does not appear in the index after importing is complete.

Interestingly, I can specify that same name-value pair combination in the 
DataImportHandler's config file, and it does appear in the index.
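
For reference, the expectation is that a name such as _dynamicTest matches the 
_dynamic* pattern. A minimal Python sketch of that name matching follows; it 
assumes the usual convention that a dynamic field pattern has a single * at 
the start or the end, and it is an illustration rather than Solr's actual 
implementation.

```python
def matches_dynamic_field(pattern, field_name):
    """Check a field name against a dynamic field pattern (illustrative only).

    Assumes the pattern has at most one '*', at the start or at the end.
    """
    if pattern.startswith("*"):
        return field_name.endswith(pattern[1:])
    if pattern.endswith("*"):
        return field_name.startswith(pattern[:-1])
    return field_name == pattern
```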




Forrest PDF non-Latin-1 support [was: RE: prototype Solr 1.3 RC 1]

2008-08-29 Thread Steven A Rowe
On 08/29/2008 at 3:24 PM, Chris Hostetter wrote:
 I suspect the PDF formatter just doesn't play nicely with the
 non-trivial UTF-8 characters.

This is an Apache FOP FAQ; from 
http://xmlgraphics.apache.org/fop/faq.html#pdf-characters:

   6.2. Some characters are not displayed, or displayed
incorrectly, or displayed as #.

   This usually means the selected font doesn't have a
   glyph for the character.

   The standard text fonts supplied with Acrobat Reader have
   mostly glyphs for characters from the ISO Latin 1 character
   set. [...]

   If you use your own fonts, the font must have a glyph for the
   desired character. Furthermore the font must be available on
   the machine where the PDF is viewed or it must have been
   embedded in the PDF file. [...]

There's an open Forrest bug for this problem: 
https://issues.apache.org/jira/browse/FOR-132, and the discussion there 
includes a link to the Cocoon documentation for embedding fonts in PDF files: 
http://cocoon.apache.org/2.1/userdocs/pdf-serializer.html#FOP+and+Embedding+Fonts.

This looks kinda complicated, and AFAICT would require modifications to the 
Forrest installation wherever the site is built.

I suspect that almost nobody looks at the PDF version of the "Who we are" page 
(and I sure am sorry now that I brought this up...)

If things are left as-is, Otis's last name would be displayed properly in the 
HTML, and garbled in the PDF; if the diacritic is removed, then it will be 
displayed improperly in both places :)

Steve


[jira] Updated: (SOLR-742) Unable to create dynamic fields with custom DataImportHandler transformer

2008-08-29 Thread Wojtek Piaseczny (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wojtek Piaseczny updated SOLR-742:
--

Fix Version/s: (was: 1.3)
   1.4

 Unable to create dynamic fields with custom DataImportHandler transformer
 -

 Key: SOLR-742
 URL: https://issues.apache.org/jira/browse/SOLR-742
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 1.3
Reporter: Wojtek Piaseczny
 Fix For: 1.4


 Discussion at: 
 http://www.nabble.com/Creating-dynamic-fields-with-DataImportHandler-to19226532.html
 Dynamic fields aren't created when specified in a DataImportHandler's 
 transformer. 
 Reproducing the issue:
 I have defined a dynamic field (of type sdouble) in my schema called 
 _dynamic*. Inside the transformer's transformRow method, I am adding the 
 name-value pair _dynamicTest and '1.0'. No errors are observed, but the 
 data does not appear in the index after importing is complete.
 Interestingly, I can specify that same name-value pair combination in the 
 DataImportHandler's config file, and it does appear in the index.




[jira] Commented: (SOLR-739) Add support for OmitTf

2008-08-29 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627129#action_12627129
 ] 

Hoss Man commented on SOLR-739:
---

FWIW: this is a situation where revving the schema version could make sense as 
well (just as we did when adding multiValued, we want the default to change but 
not for existing users)

 Add support for OmitTf
 --

 Key: SOLR-739
 URL: https://issues.apache.org/jira/browse/SOLR-739
 Project: Solr
  Issue Type: New Feature
Reporter: Mark Miller
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-739.patch


 Allow setting omitTf in the field schema. Default to true for all but text 
 fields.




Re: solr2: Onward and Upward

2008-08-29 Thread Yonik Seeley
On Fri, Aug 29, 2008 at 2:03 PM, Yonik Seeley [EMAIL PROTECTED] wrote:
  - allow more lower level index operations... create a new index at a
 given spot, merge multiple indices, etc.

 - possibly add the ability to pull a lucene index from hdfs (via a
plugin if we don't want a hard dependency on hdfs)
   or from another solr server (like the in-development java based
replication will allow).

  - make Solr more scalable and cloud computing friendly... make it
 easier to create and deploy clusters/shards, as well as change the
 size of clusters
 - remove the single-master points of failure per-shard (support
 or incorporate something like bailey)
 - make it easier to deploy config changes (possibly use
 zookeeper... prob want that for cluster management anyway)

People interested in the scalability part should look at bailey
(people got busy and discussion died down, but there's some
interesting stuff).
http://sourceforge.net/mailarchive/forum.php?forum_name=bailey-developers
http://bailey.wiki.sourceforge.net/

-Yonik


Re: update response warning

2008-08-29 Thread Chris Hostetter

: DataImportHandler should also be considered experimental.

Yonik was specifically referring to response formats/structures, many of 
which have had a warning in them cautioning people not to write code that 
depends on the exact format (ie: programmatically parsing the response) until 
we decided if we were happy with it.

marking classes, plugins, and internal APIs Experimental can be done 
with javadocs.

(if however, you are not certain that the response format from a DIH 
request handler is fully baked, that would be a good place to add a call 
to addExperimentalFormatWarning)

: On Fri, Aug 29, 2008 at 2:53 AM, Yonik Seeley [EMAIL PROTECTED] wrote:
: 
:  Here are the current handlers that add this warning:
:  $ find . -name \*.java | xargs grep addExperimentalFormatWarning
:  ./handler/admin/LukeRequestHandler.java:
:  RequestHandlerUtils.addExperimentalFormatWarning( rsp );
:  ./handler/admin/PluginInfoHandler.java:
:  RequestHandlerUtils.addExperimentalFormatWarning( rsp );
:  ./handler/admin/ShowFileRequestHandler.java:
:  RequestHandlerUtils.addExperimentalFormatWarning(rsp);
:  ./handler/admin/SystemInfoHandler.java:
:  RequestHandlerUtils.addExperimentalFormatWarning( rsp );
:  ./handler/admin/ThreadDumpHandler.java:
:  RequestHandlerUtils.addExperimentalFormatWarning( rsp );
:  ./handler/AnalysisRequestHandler.java:
:  RequestHandlerUtils.addExperimentalFormatWarning(rsp);
:  ./handler/MoreLikeThisHandler.java:
:  RequestHandlerUtils.addExperimentalFormatWarning( rsp );
:  ./handler/RequestHandlerUtils.java:  public static void
:  addExperimentalFormatWarning( SolrQueryResponse rsp )
:  ./handler/XmlUpdateRequestHandler.java:
:  RequestHandlerUtils.addExperimentalFormatWarning( rsp );
: 
:  Which ones are actually still more in flux (or that we're not happy
:  with and plan on changing in 1.4 perhaps)?
: 
:  -Yonik
: 
: 
: 
: 
: -- 
: Regards,
: Shalin Shekhar Mangar.
: 



-Hoss



[jira] Assigned: (SOLR-740) legacy gettableFiles support not working in 1.3.0-RC1

2008-08-29 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man reassigned SOLR-740:
-

Assignee: Hoss Man

 legacy gettableFiles support not working in 1.3.0-RC1
 -

 Key: SOLR-740
 URL: https://issues.apache.org/jira/browse/SOLR-740
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.3


 i haven't had a chance to drill into this yet to figure out what's going 
 wrong (i know we already had an issue about this that was fixed), but testing 
 the example configs from 1.2 against 1.3rc1 i'm seeing the following 
 behavior...
 # the following message is logged...{code}
 Aug 29, 2008 12:41:31 PM org.apache.solr.core.SolrCore initDeprecatedSupport
 WARNING: adding ShowFileRequestHandler with hidden files: [.SVN, XSLT, 
 SYNONYMS.TXT, PROTWORDS.TXT, STOPWORDS.TXT, SCRIPTS.CONF]
 {code} (Note: that is not the list of files configured in the 1.2 example 
 gettableFiles, it's the list of all files in the solr/conf dir ... and for 
 some reason they are all uppercased)
 # links on the admin screen for the schema and config files point to...{code}
 file/?file=schema.xml   ...ie...  
 http://localhost:8983/solr/admin/file/?file=schema.xml
 file/?file=solrconfig.xml  ...ie... 
 http://localhost:8983/solr/admin/file/?file=solrconfig.xml
 {code}
 # those links don't work (404, nothing seems to be logged by Solr)
 # the legacy form of the urls using get-file.jsp (which people might have 
 bookmarked) *do* in fact work...{code}
 http://localhost:8983/solr/admin/get-file.jsp?file=solrconfig.xml
 http://localhost:8983/solr/admin/get-file.jsp?file=schema.xml
 {code}...but based on the whitespace at the top of the files, i suspect that 
 is really the JSP getting used, not the ShowFileRequestHandler
 To reproduce:
 # check out the solr 1.2 tag.
 # copy the 1.3-rc1 war on top of the 1.2 example/webapps/solr.war
 # run the 1.2 example code as normal (java -jar start.jar)




[jira] Resolved: (SOLR-740) legacy gettableFiles support not working in 1.3.0-RC1

2008-08-29 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-740.
---

Resolution: Fixed

trunk r690430
1.3 branch r690432


 legacy gettableFiles support not working in 1.3.0-RC1
 -

 Key: SOLR-740
 URL: https://issues.apache.org/jira/browse/SOLR-740
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 1.3


 i haven't had a chance to drill into this yet to figure out what's going 
 wrong (i know we already had an issue about this that was fixed), but testing 
 the example configs from 1.2 against 1.3rc1 i'm seeing the following 
 behavior...
 # the following message is logged...{code}
 Aug 29, 2008 12:41:31 PM org.apache.solr.core.SolrCore initDeprecatedSupport
 WARNING: adding ShowFileRequestHandler with hidden files: [.SVN, XSLT, 
 SYNONYMS.TXT, PROTWORDS.TXT, STOPWORDS.TXT, SCRIPTS.CONF]
 {code} (Note: that is not the list of files configured in the 1.2 example 
 gettableFiles, it's the list of all files in the solr/conf dir ... and for 
 some reason they are all uppercased)
 # links on the admin screen for the schema and config files point to...{code}
 file/?file=schema.xml   ...ie...  
 http://localhost:8983/solr/admin/file/?file=schema.xml
 file/?file=solrconfig.xml  ...ie... 
 http://localhost:8983/solr/admin/file/?file=solrconfig.xml
 {code}
 # those links don't work (404, nothing seems to be logged by Solr)
 # the legacy form of the urls using get-file.jsp (which people might have 
 bookmarked) *do* in fact work...{code}
 http://localhost:8983/solr/admin/get-file.jsp?file=solrconfig.xml
 http://localhost:8983/solr/admin/get-file.jsp?file=schema.xml
 {code}...but based on the whitespace at the top of the files, i suspect that 
 is really the JSP getting used, not the ShowFileRequestHandler
 To reproduce:
 # check out the solr 1.2 tag.
 # copy the 1.3-rc1 war on top of the 1.2 example/webapps/solr.war
 # run the 1.2 example code as normal (java -jar start.jar)




[jira] Commented: (SOLR-741) Add support for rounding dates in DateField

2008-08-29 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627176#action_12627176
 ] 

Hoss Man commented on SOLR-741:
---

I propose a lot of more general things -- but I'm also a fan of simple, 
direct, specific enhancements to solve common problems.  I'm on board with 
adding support for something like this directly to DateField.

Reusing the DateMathParser syntax makes a lot of sense -- it has a lot of 
flexibility and should already be familiar to people doing non-trivial things 
with DateField.  Calling it "round" or "roundTo" seems like it would pigeonhole 
it a bit ... perhaps "forceMath" or "appendMath" or "mutate" or something 
that better conveys the idea of a general modification made to all dates.

The downsides: 
# it has no simple syntax for "round up" but it can be expressed somewhat 
verbosely (+1DAY-1MILLI/DAY rounds up to the nearest day) 
# it has no notion of "round to the nearest 5 minutes" which some people might 
expect

...but honestly, those could easily be added as new features to DateMathParser  
-- and then they'd benefit this issue as well as general Date Math usages in 
queries (like date faceting)

syntax wise: perhaps \FOO could be the round up equivalent of /FOO ? ... 
with /nFOO and \nFOO being the round down/up to the nearest nth value for 
unit FOO ?
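
To spell out why +1DAY-1MILLI/DAY behaves as a round up, here is a Python 
sketch that applies the same three steps in order: add a day, subtract a 
millisecond, then truncate to the start of the day. This is plain datetime 
arithmetic, not DateMathParser, and it assumes inputs with millisecond 
precision; a value with sub-millisecond precision just after midnight would 
not round up correctly with this approach.

```python
from datetime import datetime, timedelta


def floor_to_day(dt):
    """Equivalent of /DAY: truncate to the start of the day."""
    return dt.replace(hour=0, minute=0, second=0, microsecond=0)


def round_up_to_day(dt):
    """Equivalent of +1DAY-1MILLI/DAY, applied left to right.

    Assumes millisecond-precision input (illustrative helper).
    """
    shifted = dt + timedelta(days=1) - timedelta(milliseconds=1)
    return floor_to_day(shifted)


mid_morning = round_up_to_day(datetime(2008, 8, 29, 10, 0))
exact_midnight = round_up_to_day(datetime(2008, 8, 29))
```

A mid-morning value moves to the start of the next day, while a value already 
at midnight stays where it is, which is why the 1 millisecond subtraction is 
needed.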

 Add support for rounding dates in DateField
 ---

 Key: SOLR-741
 URL: https://issues.apache.org/jira/browse/SOLR-741
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.4


 As discussed at http://www.nabble.com/Rounding-date-fields-td19203108.html
 Since rounding dates to a coarse value is an often recommended solution to 
 decrease the number of unique terms, we should add support for doing this in 
 DateField itself. A number of syntaxes were proposed; some of them were:
 # <fieldType name="date" class="solr.DateField" sortMissingLast="true" 
 omitNorms="true" roundTo="-1MINUTE" /> (Shalin)
 # <fieldType name="date" class="solr.DateField" sortMissingLast="true" 
 omitNorms="true" round="DOWN_MINUTE" /> (Otis)
 Hoss proposed more general enhancements related to arbitrary pre-processing of 
 values prior to indexing/storing using pre-processing analyzers.
 This issue aims to build a consensus on the solution to pursue and to 
 implement that solution inside Solr.
