On Dec 5, 2005, at 9:07 PM, Yonik Seeley wrote:
There is one little problem with XML though... Its inability to
directly represent binary data, or even all unicode code points (no,
entities don't fix this). I use binary data in lucene to represent
some numerics, and that can't be represented
On Dec 5, 2005, at 9:18 PM, Yonik Seeley wrote:
If we go with XML, I think this must be solved (or else we are at the
point where we can only represent a subset of queries that lucene can
handle again).
Hmmm, maybe it's not quite so serious if the XML represents a
pre-analyzed query vs post-a
> but we should also allow for the client to push the
> analysis
> responsibility to the server:
Yet another variation we could support is to use the
existing QueryParser server-side for handling
user-typed input. On the client user input is unparsed
and combined with the lower-level constraints
One thing I like about the possibility of XML (as opposed to other
syntax) is that I could create query templates and process them with
XSLT. And I can do this client side and also in most modern browsers.
On 12/6/05, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> > example: &#0; is not valid XML
> Can you give an example of a query that needs binary information?
It's never an absolute need - one could always work around the
problem, for sure. The issue was more a desire to be able to
represent everything
On 12/6/05, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> Suppose a user of the Swing or RoR client enters "some phrase", who
> is responsible for analyzing that phrase so that it is suitable for
> PhraseQuery.add()? Right?
Right, and even more. The query one specifies may be morphed into
another ty
Yonik Seeley wrote:
On 12/6/05, Erik Hatcher <[EMAIL PROTECTED]> wrote:
Also I'd be curious to see a problem with Unicode code points in XML,
if you have one handy.
The definition of valid XML 1.0 characters:
#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
The simplest
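To make the character restriction concrete, here is a minimal sketch of a check against the XML 1.0 Char production (#x9, #xA, #xD, [#x20-#xD7FF], [#xE000-#xFFFD], [#x10000-#x10FFFF]). The function names are made up for illustration and are not part of Lucene or any XML library.

```python
def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def first_invalid_index(text: str):
    """Return the index of the first character XML 1.0 cannot carry, or None."""
    for i, ch in enumerate(text):
        if not is_xml10_char(ord(ch)):
            return i
    return None
```

A term containing U+0000, for instance, fails this check no matter how it is written in the document, because a character reference must also resolve to an allowed character.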
SearchBlox Software has released Version 3.1 of its J2EE Content Search
Software. SearchBlox delivers out-of-the-box search functionality for quick
and easy integration with websites, applications, intranets and portals.
SearchBlox uses the Lucene Search API and incorporates integrated
HTTP/HTTPS,
I certainly agree with this change!
We all have interest in having a clean and understandable API.
Erik Hatcher
<[EMAIL PROTECTED]
> Are you aware, though, of an existing Unicode serialization/markup
> mechanism without XML's gaps?
No, but I'm not advocating anything other than XML. I'm just pointing
out a problem that needs to be solved.
> Base64 is frequently used as an escape mechanism for binary data in XML.
Yeah, but
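For reference, the Base64 approach mentioned above might look roughly like the sketch below. The element name "term" and the "field" and "encoding" attributes are hypothetical; no query schema had been agreed on in this thread.

```python
import base64
import xml.etree.ElementTree as ET


def term_to_xml(field: str, raw: bytes) -> str:
    """Serialize a binary term value as Base64 text inside a hypothetical <term> element."""
    el = ET.Element("term", {"field": field, "encoding": "base64"})
    el.text = base64.b64encode(raw).decode("ascii")
    return ET.tostring(el, encoding="unicode")


def term_from_xml(xml_text: str):
    """Parse the element produced by term_to_xml back into (field, raw bytes)."""
    el = ET.fromstring(xml_text)
    # An empty value serializes to an element with no text, so el.text may be None.
    return el.get("field"), base64.b64decode(el.text or "")
```

The round trip is lossless for arbitrary bytes, at the cost of values that are no longer human-readable in the XML.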
Maybe I'm a bit late with this, but:
There is an ongoing effort at the W3C to define a full-text
search language that could extend their XPath and XQuery
languages (which clearly makes sense).
These are the current documents on the topic:
http://www.w3.org/TR/2005/WD-xquery-full-text-20051103/
http:
That's basically what I'm implementing with Nux, except that the
syntax and calling conventions are a bit different, and that Lucene
analyzers can optionally be specified, which makes it a lot more
powerful (but also a bit more complicated).
Wolfgang.
On Dec 6, 2005, at 10:48 AM, Incze Laj
On Tuesday 06 December 2005 03:20, Chris Hostetter wrote:
...
>
> I can think of at least two big use cases that I'm concerned about
>
> 1) Human creation
...
>
> 2) Aliasing
>
...
Meanwhile I scratched the surface of XSL, and I think it can allow
both simplification and aliasing in one
Build an index which allows me to browse by category.
-
Key: LUCENE-477
URL: http://issues.apache.org/jira/browse/LUCENE-477
Project: Lucene - Java
Type: Task
Components: Index
Versions: 1.4
Environment
Yonik wrote:
For normal text data, with valid unicode characters that aren't legal
XML, I'd rather have a simple escaping mechanism. Something like
backslash escaping that is easily understood. Maybe something as
simple as \00 for &#0; (backslash followed by two hex digits).
I agree with your go
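A minimal sketch of the backslash-plus-two-hex-digits idea follows. Two choices here are assumptions that the proposal does not spell out: the backslash itself is escaped as \5c so the encoding stays reversible, and only control characters below 0x20 are handled, since two hex digits cannot express surrogates or #xFFFE/#xFFFF.

```python
import re

# Code points below 0x20 that XML 1.0 does allow: tab, line feed, carriage return.
_ALLOWED_CONTROLS = {0x09, 0x0A, 0x0D}


def escape(text: str) -> str:
    """Replace XML-illegal control characters (and the backslash) with \\xx."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\" or (cp < 0x20 and cp not in _ALLOWED_CONTROLS):
            out.append("\\%02x" % cp)
        else:
            out.append(ch)
    return "".join(out)


def unescape(text: str) -> str:
    """Reverse escape(): turn each \\xx sequence back into its character."""
    return re.sub(
        r"\\([0-9a-fA-F]{2})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )
```

Because every literal backslash is itself escaped, any backslash in the escaped text is always followed by exactly two hex digits, so unescape() never misreads ordinary text.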
For normal text data, with valid unicode characters that aren't legal
XML, I'd rather have a simple escaping mechanism. Something like
backslash escaping that is easily understood. Maybe something as
simple as \00 for &#0; (backslash followed by two hex digits).
Similar RFC for an extension to XM
[
http://issues.apache.org/jira/browse/LUCENE-477?page=comments#action_12359495 ]
Hoss Man commented on LUCENE-477:
-
This isn't a "bug" or a "feature" or a "task" as much as it is a "question"
about using lucene in a particular way. Questions generally rec
[ http://issues.apache.org/jira/browse/LUCENE-477?page=all ]
Erik Hatcher closed LUCENE-477:
---
Resolution: Invalid
Yes, please bring this topic to the user list rather than JIRA
> Build an index which allows me to browse by category.
> ---