[jira] Commented: (SOLR-232) let Solr set request headers (for logging)

2007-05-10 Thread Ian Holsman (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494918
 ] 

Ian Holsman commented on SOLR-232:
--

Hi Otis.

The UI would be a generic monitoring tool similar to 
http://pyro.holsman.net:8000/ganglia/?m=atomics_rows_mean&c=atomics.
And yes, including an error message would be much more helpful. The code was more 
to show how it could be done. Some more meta values still need to be added.


Other uses I was planning on are:
- log file replay for performance testing
- top N queries
- top N errors
- capacity planning / finding poorly performing queries, etc.

> let Solr set request headers (for logging)
> --
>
> Key: SOLR-232
> URL: https://issues.apache.org/jira/browse/SOLR-232
> Project: Solr
>  Issue Type: New Feature
> Environment: tomcat?
>Reporter: Ian Holsman
>Priority: Minor
> Attachments: meta.patch
>
>
> I need the ability to log certain information about a request so that I can 
> feed it into performance and capacity monitoring systems.
> I would like to know things like
> - how long the request took 
> - how many rows were fetched and returned
> - what handler was called.
> per request.
> the following patch is 1 way to implement this, I'm sure there are better 
> ways.
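
The per-request values listed above (elapsed time, rows fetched and returned, handler) could be pictured as one small record rendered as a single log line. This is only a sketch and not the code in meta.patch; the class name RequestStats, its field names, and the key=value format are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical per-request record; none of these names come from meta.patch.
final class RequestStats {
    private final String handler;    // which request handler served the request
    private final long elapsedMs;    // how long the request took, in milliseconds
    private final int rowsFetched;   // rows matched/fetched by the query
    private final int rowsReturned;  // rows actually sent back to the client

    RequestStats(String handler, long elapsedMs, int rowsFetched, int rowsReturned) {
        this.handler = handler;
        this.elapsedMs = elapsedMs;
        this.rowsFetched = rowsFetched;
        this.rowsReturned = rowsReturned;
    }

    // Render as one "key=value key=value" line that a log parser can split on spaces.
    String toLogLine() {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("handler", handler);
        fields.put("elapsedMs", elapsedMs);
        fields.put("rowsFetched", rowsFetched);
        fields.put("rowsReturned", rowsReturned);

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }
}
```

A monitoring system could then aggregate such lines per handler, which is the kind of feed the issue asks for.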

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: [Fwd: [Jetty-support] Stable Release 6.1.2]

2007-05-10 Thread Mike Klaas

On 10-May-07, at 4:46 PM, Ryan McKinley wrote:
> Otis Gospodnetic wrote:
>> I haven't moved to 6.* yet.  But I did notice 6.1.3 showed up a
>> few days ago.
>
> I think 6.1.3 showed up a day after 6.1.2!

That's always an encouraging sign :)

-Mike


Re: [Fwd: [Jetty-support] Stable Release 6.1.2]

2007-05-10 Thread Ryan McKinley

Otis Gospodnetic wrote:
> I haven't moved to 6.* yet.  But I did notice 6.1.3 showed up a few days ago.

I think 6.1.3 showed up a day after 6.1.2!

I just posted a jar with the contents of /example on
https://issues.apache.org/jira/browse/SOLR-128

I'm only using it for very simple things, but it's worked well so far.

ryan




[jira] Updated: (SOLR-128) Include Newer version of Jetty

2007-05-10 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley updated SOLR-128:
---

Attachment: jetty-6.3-example.zip

Here is a zip with the example directory as we would (maybe) want it.

This uses jetty-6.1.3



> Include Newer version of Jetty
> --
>
> Key: SOLR-128
> URL: https://issues.apache.org/jira/browse/SOLR-128
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Reporter: Ryan McKinley
>Priority: Minor
> Attachments: jetty-6.3-example.zip, Jetty6.config.patch, lib.zip, 
> start.jar
>
>
> It would be good to include an up-to-date jetty version for the example.




Re: [Fwd: [Jetty-support] Stable Release 6.1.2]

2007-05-10 Thread Otis Gospodnetic
I haven't moved to 6.* yet.  But I did notice 6.1.3 showed up a few days ago.

Otis
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/  -  Tag  -  Search  -  Share

- Original Message 
From: Yonik Seeley <[EMAIL PROTECTED]>
To: solr-dev@lucene.apache.org
Sent: Wednesday, May 9, 2007 11:53:49 PM
Subject: Re: [Fwd: [Jetty-support] Stable Release 6.1.2]

On 5/2/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> should we consider including this?

+1 to Jetty 1.2
Otis, any observations yet?

Longer term, it would be nice to have an external test suite that one
could point at a real solr server and verify it is configured
correctly:
 - UTF8 in URLs, etc
 - load testing: no resource leaks, deadlocks, etc
 - update concurrency testing - test synchronization under heavy load

-Yonik





[jira] Commented: (SOLR-232) let Solr set request headers (for logging)

2007-05-10 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494899
 ] 

Otis Gospodnetic commented on SOLR-232:
---

Interesting.  Does it have a UI piece?  Do you need that "start" variable?  
Looks like you could just use "i".  Keeping track of "ERR" is useful, but it 
may be more useful to keep track of the actual errors.  Maybe something as 
simple as e.getMessage()?





[jira] Updated: (SOLR-225) Allow pluggable Highlighting classes -- Formatters and Fragmenters

2007-05-10 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley updated SOLR-225:
---

Attachment: SOLR-225-HighlightingConfig.patch

No real changes; this version applies cleanly to trunk.

> The plugin architecture seems like something that could be made more general 
> than just plugins. 

It is a 95% duplicate of the RequestHandler plugin architecture.  The only 
reason it could not be identical was the lazy-loading request handlers.

Currently Solr has two plugin initialization types: 
1. init( NamedList args )
2. init( Map args)

If we added an interface for each initialization type, we could probably do all 
plugin initialization with something like the PluginLoader class in this patch:

class PluginLoader
{
 public Map load( NodeList nodes ) {
   ...
 }
}
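
The "one interface per initialization type" idea could be sketched roughly as follows. This is not taken from the patch: the interface names MapInitialized and ListInitialized and the PluginInit.initPlugin helper are hypothetical, and an ordered list of name/value entries stands in for Solr's NamedList so that the example is self-contained.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface for plugins that want their arguments as a Map.
interface MapInitialized {
    void init(Map<String, String> args);
}

// Hypothetical interface for plugins that want ordered name/value pairs.
// A List of entries stands in for Solr's NamedList in this sketch.
interface ListInitialized {
    void init(List<Map.Entry<String, String>> args);
}

final class PluginInit {
    // Dispatch to whichever initialization style the plugin declares.
    static void initPlugin(Object plugin, List<Map.Entry<String, String>> args) {
        if (plugin instanceof ListInitialized) {
            ((ListInitialized) plugin).init(args);
        } else if (plugin instanceof MapInitialized) {
            // Collapse the ordered pairs into a map; later duplicates win.
            Map<String, String> map = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : args) {
                map.put(e.getKey(), e.getValue());
            }
            ((MapInitialized) plugin).init(map);
        } else {
            throw new IllegalArgumentException(
                "plugin has no known init style: " + plugin.getClass());
        }
    }
}
```

With something like this, a single loader would not need to know in advance which style a given request handler, formatter, or fragmenter uses.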




> Allow pluggable Highlighting classes -- Formatters and Fragmenters
> --
>
> Key: SOLR-225
> URL: https://issues.apache.org/jira/browse/SOLR-225
> Project: Solr
>  Issue Type: Improvement
>Reporter: Brian Whitman
> Attachments: SOLR-225-HighlightingConfig.patch, 
> SOLR-225-HighlightingConfig.patch
>
>
> Highlighting should support a pluggable architecture similar to what is seen 
> with RequestHandlers, Fields, FieldTypes, etc
>
> For more background:
> http://www.nabble.com/Custom-fragmenter-tf3681588.html#a10289335




[jira] Resolved: (SOLR-231) By default, use UTF-8 for posted content streams

2007-05-10 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-231.


Resolution: Fixed

added in rev 537024

> By default, use UTF-8 for posted content streams
> 
>
> Key: SOLR-231
> URL: https://issues.apache.org/jira/browse/SOLR-231
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan McKinley
> Assigned To: Ryan McKinley
> Fix For: 1.2
>
> Attachments: SOLR-231-ContentType-UTF8.patch, 
> SOLR-231-ContentType-UTF8.patch
>
>
> Solr should assume UTF-8 encoding unless the contentType says otherwise.  To 
> change the contentType and encoding, set the header value, for example 
> contentType="text/xml; charset=utf-8".
> Likewise, content passed with stream.body= will default to UTF-8 unless the 
> stream.contentType says otherwise.
>  
> For previous discussion, see:
> http://www.nabble.com/resin-and-UTF-8-in-URLs-tf3152910.html
> http://www.nabble.com/charset-in-POST-from-browser-tf3153057.html
> http://www.nabble.com/Re%3A-svn-commit%3A-r536048lucene-solr-trunk-src-webapp-src-org-apache-solr-servlet-SolrRequestParsers.java-tf3712816.html
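
The rule described above (use the charset named in the content type, otherwise fall back to UTF-8) can be sketched as below. This is an illustration only, not the code in SolrRequestParsers or in the attached patch; the class and method names are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper; not part of Solr.
final class ContentTypeCharset {
    // Return the charset named in a Content-Type value, or UTF-8 when none is given.
    // Charset.forName throws an unchecked exception for unknown or malformed names.
    static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim();
                    // Strip optional surrounding double quotes.
                    if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                        name = name.substring(1, name.length() - 1);
                    }
                    if (!name.isEmpty()) {
                        return Charset.forName(name);
                    }
                }
            }
        }
        return StandardCharsets.UTF_8; // the default described in this issue
    }
}
```

For example, "text/xml; charset=ISO-8859-1" yields ISO-8859-1, while a bare "text/xml" or a missing content type yields UTF-8.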




[jira] Assigned: (SOLR-231) By default, use UTF-8 for posted content streams

2007-05-10 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley reassigned SOLR-231:
--

Assignee: Ryan McKinley




Re: FederatedSearch and large Lucene distributed indexes

2007-05-10 Thread Mike Klaas

On 10-May-07, at 3:02 PM, Daniel Creão wrote:
> So, I tried Solr and read about FederatedSearch and CollectionDistribution.
> An 'all-machines-have-complete-index' strategy (using rsync) can improve
> system throughput and concurrency by having each station process different
> queries, but each query will take the same amount of time as on a
> single-node system (which sucks).

A single-node system _with 1/N the traffic_, sure.

> When each station of an N-station cluster indexes 1/N of the text collection,
> each machine will spend less time processing a query, but all machines must
> process the same query at the same time ('goodbye, concurrency', IMO), then
> merge results.


I don't really understand this.

For huge corpora, you must distribute different parts of the index  
over multiple servers.  For high throughput, you must distribute the  
same part of the index over multiple servers.  These are not  
competing strategies, and to solve both problems, both solutions must  
be employed.
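
To make the combination concrete: the cluster can be pictured as a grid in which each index part (shard) is held by several servers (replicas), and each query is sent to one replica of every shard. The sketch below only illustrates that layout; nothing like it exists in Solr at this point, and the class name ClusterLayout and the round-robin choice of replica are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical layout: replicasByShard.get(s) lists the servers holding shard s.
final class ClusterLayout {
    private final List<List<String>> replicasByShard;

    ClusterLayout(List<List<String>> replicasByShard) {
        this.replicasByShard = replicasByShard;
    }

    // For one query, choose one replica of every shard.
    // Successive query numbers rotate through the replicas, spreading the load.
    List<String> serversFor(long queryNumber) {
        List<String> chosen = new ArrayList<>();
        for (List<String> replicas : replicasByShard) {
            int idx = (int) Math.floorMod(queryNumber, (long) replicas.size());
            chosen.add(replicas.get(idx));
        }
        return chosen;
    }
}
```

Adding shards keeps each server's index small (corpus size), while adding replicas per shard lets different queries run on different machines (throughput); the results from the chosen servers still have to be merged.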



> Did I get anything wrong (about Hadoop and Solr)?
>
> Is Multiple Masters/FederatedSearch under development? What is its status?
> Or should I develop it myself?


Implementation of this in Solr is still in the highly theoretical  
stage, so is unlikely to happen any time soon.


You might try Nutch, which is basically an implementation of this  
strategy using Lucene.


-Mike

[jira] Resolved: (SOLR-226) support dynamic fields as copyField destination

2007-05-10 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-226.


   Resolution: Fixed
Fix Version/s: 1.2 (was: 1.3)

added in rev 536730

> support dynamic fields as copyField destination
> ---
>
> Key: SOLR-226
> URL: https://issues.apache.org/jira/browse/SOLR-226
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan McKinley
> Assigned To: Ryan McKinley
>Priority: Minor
> Fix For: 1.2
>
> Attachments: SOLR-226-DynamicCopyField.patch, 
> SOLR-226-DynamicCopyField.patch
>
>
> I'd like to use a dynamic field as the destination of a copyField:
> Given:
>   
>   
> I want:
>
> For background see:
> http://www.nabble.com/copyField-to-a-dynamic-field-tf2300115.html#a6419101
> http://www.nabble.com/dynamic-copyFields-tf3683816.html#a10296520




[jira] Resolved: (SOLR-224) PhoneticFilterFactory -- support Metaphone/Soundex filters

2007-05-10 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-224.


Resolution: Fixed

in rev 537014

> PhoneticFilterFactory -- support Metaphone/Soundex filters
> --
>
> Key: SOLR-224
> URL: https://issues.apache.org/jira/browse/SOLR-224
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ryan McKinley
> Assigned To: Ryan McKinley
>Priority: Minor
> Attachments: commons-codec-1.3.jar, 
> SOLR-224-PhoneticFilterFactory.patch, SOLR-224-PhoneticFilterFactory.patch
>
>
> A simple FilterFactory to replace or inject terms encoded with commons codec 
> functions:
> http://jakarta.apache.org/commons/codec/api-release/org/apache/commons/codec/language/package-summary.html




FederatedSearch and large Lucene distributed indexes

2007-05-10 Thread Daniel Creão

I'm looking for a strategy to build a large distributed Lucene index,
but after reading a lot of mail threads, it seems that Lucene doesn't have a
'default solution' for that yet.

The search engine that I'm building has a 400 GB text database (right now;
it grows every day and documents are never deleted) and thousands of
queries per day (a very short response time is needed). I expect to use
a cluster of commodity computers as the hardware infrastructure, and the
distribution strategy must allow concurrent/parallel processing and searching
on all processors.

Searching the Lucene site, I thought Hadoop was the right choice. After reading
some mail threads about this (on the java lucene and hadoop lists, dev and user),
it seems that Hadoop isn't that good (let's put it this way) at dealing with
distributed indexes for a search engine.

So, I tried Solr and read about FederatedSearch and CollectionDistribution.
An 'all-machines-have-complete-index' strategy (using rsync) can improve
system throughput and concurrency by having each station process different
queries, but each query will take the same amount of time as on a
single-node system (which sucks).

When each station of an N-station cluster indexes 1/N of the text collection,
each machine will spend less time processing a query, but all machines must
process the same query at the same time ('goodbye, concurrency', IMO), then
merge results.

Neither of them looks good to me.

The 'Multiple Masters' solution (described on FederatedSearch) looks good.
"There could be a master for each slice of the index. An external module
could provide the update interface and forward the request to the correct
master based on the unique key field". That's great: it has high scalability,
high concurrency, a smaller index for each node... I read some papers
describing very similar solutions [1][2][3] and discussing how to solve some
of this strategy's issues.
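
The "forward the request to the correct master based on the unique key field" step could look roughly like this. It is a sketch under assumptions: the wiki text does not say how a key is mapped to a master, so hashing the key and taking it modulo the number of masters is just one simple choice, and the MasterRouter name is invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical external module that picks the master for a document.
final class MasterRouter {
    private final List<String> masters; // one master URL (or host name) per index slice

    MasterRouter(List<String> masters) {
        if (masters.isEmpty()) {
            throw new IllegalArgumentException("need at least one master");
        }
        this.masters = masters;
    }

    // The same unique key always maps to the same master, so an update or
    // delete for a document lands on the slice that already holds it.
    String masterFor(String uniqueKey) {
        CRC32 crc = new CRC32();
        crc.update(uniqueKey.getBytes(StandardCharsets.UTF_8));
        int slice = (int) (crc.getValue() % masters.size()); // getValue() is non-negative
        return masters.get(slice);
    }
}
```

One known drawback of plain hash-modulo routing is that changing the number of masters remaps most keys, so the slices would have to be rebuilt or a different mapping used.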

Did I get anything wrong (about Hadoop and Solr)?

Is Multiple Masters/FederatedSearch under development? What is its status?
Or should I develop it myself?

Thanks for any help.
Daniel

[1] B. Ribeiro-Neto, J. Kitajima, G. Navarro and N. Ziviani. Parallel
Generation of Inverted Files for Distributed Text Collections. In
Proceedings of the XVIII International Conference of the Chilean Computer
Science Society. IEEE Computer Society, Washington, 1998.

[2] C. Badue, R. Baeza-Yates, B. Ribeiro-Neto and N. Ziviani. Distributed
Query Processing Using Partitioned Inverted Files. Eighth Symposium on
String Processing and Information Retrieval (SPIRE'01), 2001.

[3] C. Badue, R. Barbosa, B. Ribeiro-Neto and N. Ziviani. Basic Issues on the
Processing of Web Queries. In Proceedings of the 28th Annual International
ACM SIGIR Conference on Research and Development in Information Retrieval,
2005.


[jira] Assigned: (SOLR-224) PhoneticFilterFactory -- support Metaphone/Soundex filters

2007-05-10 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley reassigned SOLR-224:
--

Assignee: Ryan McKinley




[jira] Resolved: (SOLR-233) Add UTF-8 support to example.xsl

2007-05-10 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-233.


Resolution: Fixed

changed example.xsl to have:

   

in rev 537004

> Add UTF-8 support to example.xsl
> 
>
> Key: SOLR-233
> URL: https://issues.apache.org/jira/browse/SOLR-233
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.1.0
> Environment: all
>Reporter: KuroSaka TeruHiko
> Assigned To: Ryan McKinley
>
> If conf/xslt/example.xsl is applied to non-ASCII characters such as Arabic, 
> the output gets garbled, because the output encoding is not properly 
> specified.
> The xsl:output element in example.xsl needs to be modified as suggested in 
> the following email:
> From: Brian Whitman 
> Sent: Thursday, May 10, 2007 1:19 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Does Solr XSL writer work with Arabic text?
> In example.xsl change the output type
>
> to
>
> And see if that helps. I had the same problem (different language.)  
> If this works we should file a JIRA to fix it up in trunk.
> On May 10, 2007, at 4:13 PM, Teruhiko Kurosaka wrote:
> > I'm trying to search an index of docs which have text fields in  
> > Arabic,
> > using XSL writer (wt=xslt&tr=example.xsl).  But the Arabic text gets
> > all garbled.  Is XSL writer known to work for Arabic text? Is anybody
> > using it?
> >
> > -kuro




[jira] Assigned: (SOLR-233) Add UTF-8 support to example.xsl

2007-05-10 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley reassigned SOLR-233:
--

Assignee: Ryan McKinley




Re: Field collapsing functionality

2007-05-10 Thread Otis Gospodnetic
Emmanuel,

This sounds useful!
Here is everything you'll need to know: 
http://wiki.apache.org/solr/HowToContribute

Thanks,
Otis 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/  -  Tag  -  Search  -  Share





[jira] Created: (SOLR-233) Add UTF-8 support to example.xsl

2007-05-10 Thread KuroSaka TeruHiko (JIRA)
Add UTF-8 support to example.xsl


 Key: SOLR-233
 URL: https://issues.apache.org/jira/browse/SOLR-233
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.1.0
 Environment: all
Reporter: KuroSaka TeruHiko


If conf/xslt/example.xsl is applied to non-ASCII characters such as Arabic, the 
output gets garbled, because the output encoding is not properly specified.
The xsl:output element in example.xsl needs to be modified as suggested in the 
following email:

From: Brian Whitman 
Sent: Thursday, May 10, 2007 1:19 PM
To: [EMAIL PROTECTED]
Subject: Re: Does Solr XSL writer work with Arabic text?

In example.xsl change the output type

   

to

   


And see if that helps. I had the same problem (different language.)  
If this works we should file a JIRA to fix it up in trunk.




On May 10, 2007, at 4:13 PM, Teruhiko Kurosaka wrote:

> I'm trying to search an index of docs which have text fields in  
> Arabic,
> using XSL writer (wt=xslt&tr=example.xsl).  But the Arabic text gets
> all garbled.  Is XSL writer known to work for Arabic text? Is anybody
> using it?
>
> -kuro





Field collapsing functionality

2007-05-10 Thread Emmanuel Keller

Hi,

My name is Emmanuel Keller. I am an engineer working as technical manager
for a French company.

For some projects, I determined that Lucene was the search engine we needed.
I worked hard to successfully integrate Solr on the first project:
http://www.usinenouvelle.com/expo

For the next project (not yet online), I needed collapsing functionality,
described here: http://www.fastsearch.com/glossary.aspx?m=48&amid=299.

Finally, I implemented it on Solr-1.1.1-dev.

If you agree, I propose to commit a version for the current trunk.

Sincerely yours,
Emmanuel.


P.S.: English is not my native language.


--
Emmanuel Keller.
CTO - GISI Interactive
12-14 rue Médéric - 75017 PARIS
tél. : 33 (0)1 56 79 41 30
fax : 33 (0)1 43 80 44 28
mobile : 33 (0)6 84 09 99 05
e.mail : [EMAIL PROTECTED]
http://www.usinenouvelle.com


RE: bug in JSON Response

2007-05-10 Thread Chris Hostetter

: Thanks, I should have checked the wiki.

it's also called out in the "Upgrading from Solr 1.1" section of the
CHANGES.txt file...

http://svn.apache.org/repos/asf/lucene/solr/trunk/CHANGES.txt
...
The JSON response format for facets has changed to make it easier for
clients to retain sorted order.  Use json.nl=map explicitly in clients
to get the old behavior, or add it as a default to the request handler
in solrconfig.xml



-Hoss



Re: Various Ideas from ApacheCon

2007-05-10 Thread J. Delgado

The ever-growing presence of mingled structured and unstructured data is a
fact of life, and one that modern systems have to deal with. Clearly, the
tendency is that full-text indexing is moving towards DB functionality, i.e.
fields for projection/filtering, sorting, faceted queries,
transactional CRUD operations etc. Though set manipulation is not Lucene's
or Solr's forte, the document-object model maps very well to rows of
relational sets or tables, even more so since CLOBs and TEXT fields were
introduced.

On the other hand, relational databases with XML and OO extensions and
native XML repositories still have to deal with the problem of RANKING
unstructured text and combinations of text fragments and structured
conditions. They are thus no longer dealing just with a set/relational model
that yields binary answers, but extending their query languages to handle the
concepts of fuzziness, relevance, etc. (e.g. SQL/MM, XQuery-FullText).

I would like once again to open this can of worms, and perhaps think outside
the box, without classifying DB and Full-Text as simply different, as we
analyze concepts to further understand the real path for the evolution of
Lucene/Solr.

Here is a very interesting attempt to create a special type of "index"
called Domain Index to query unstructured data within Oracle by Marcelo
Ochoa:
https://issues.apache.org/jira/browse/LUCENE-724

Other interesting articles:

XQuery 1.0 - Full-Text:
http://www.w3.org/TR/xquery-full-text/
SQL/MM Full-Text
http://www.wiscorp.com/2CD1R1-02-fulltext-2001-12.pdf

Discussions on *XML data model vs. relational model*
http://www.xml.com/cs/user/view/cs_msg/2645

http://www.w3.org/TR/xpath-datamodel/
http://en.wikipedia.org/wiki/Relational_model

2007/5/9, James liu <[EMAIL PROTECTED]>:
> I think the top things lucene/solr should do are:
> 1: easier use and less code
> 2: distributed index and search
> 3: managing these index and search servers
> 4: test methods or tools
>
> i don't agree
>
> 2007/5/8, Grant Ingersoll <[EMAIL PROTECTED]>:
>> Yep, my advice always is use a db for what a db is designed for (set
>> manipulation) and use Lucene for what it is good for
>
> maybe fs+lucene/solr is better
>
> --
> regards
> jl



RE: bug in JSON Response

2007-05-10 Thread Gunther, Andrew
Thanks, I should have checked the wiki.
Cheers,
Andrew



Re: bug in JSON Response

2007-05-10 Thread Yonik Seeley

On 5/10/07, Gunther, Andrew <[EMAIL PROTECTED]> wrote:
> Anyone notice an error in the JSON response when requesting facets.
>
> The response I get is:
> {"facet_queries":{},"facet_fields":{"subject":["Landscape",10335,"River",1767,"Mountain",1278,"Architecture",1184] }}
>
> It seems like the JSONArray subject should yield a JSONObject with
> name,value pairs like:
>
> {"subject":[{"Landscape":10335,"River":1767,"Mountain":1278,"Architecture":1184}] }
>
> I've checked the bug list but nothing is showing up.
> Anyone using JSON?


The default was changed because many JSON clients don't maintain order
of key/value pairs in a JSONObject.

See http://wiki.apache.org/solr/SolJSON

If you wish to get a JSON object for "ordered" key/value pairs like
facet counts,
pass in json.nl=map
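
A client that keeps the default (flat array) format can rebuild ordered name/count pairs itself. The sketch below assumes the response has already been parsed by some JSON library into a plain list that alternates names and counts; the FacetCounts class and its method name are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical client-side helper; not part of Solr.
final class FacetCounts {
    // Convert [name, count, name, count, ...] into a map that keeps insertion
    // order, so the facets stay sorted the way the server returned them.
    static Map<String, Long> fromFlatList(List<?> flat) {
        if (flat.size() % 2 != 0) {
            throw new IllegalArgumentException("expected alternating name/count pairs");
        }
        Map<String, Long> counts = new LinkedHashMap<>();
        for (int i = 0; i < flat.size(); i += 2) {
            String name = (String) flat.get(i);
            long count = ((Number) flat.get(i + 1)).longValue();
            counts.put(name, count);
        }
        return counts;
    }
}
```

Applied to the "subject" array from the response above, this gives Landscape, River, Mountain, Architecture in that order with their counts, which is the ordering a generic JSON object would not guarantee.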

-Yonik


bug in JSON Response

2007-05-10 Thread Gunther, Andrew
Anyone notice an error in the JSON response when requesting facets?

The response I get is:
{"facet_queries":{},"facet_fields":{"subject":["Landscape",10335,"River",1767,"Mountain",1278,"Architecture",1184] }}

It seems like the JSONArray subject should yield a JSONObject with
name,value pairs like:

{"subject":[{"Landscape":10335,"River":1767,"Mountain":1278,"Architecture":1184}] }

I've checked the bug list but nothing is showing up.
Anyone using JSON?

Cheers,
Andrew