Chris,
Thanks for your feedback.
Yes, I see that there is a lot of demand for a more customizable
full-text index. Have you already tried building some additional index
databases, based on the rules you listed here? It's not as
comfortable as a tightly coupled full-text index, but the more use
Hi,
I just want to say that for the dictionary that I used BaseX for,
having a multilingual full-text index would have been very nice. Failing
that, a partial index based on rules the user supplies would also have
also been nice. For instance, being able
Thank you. I tried it out, and now everything works fine.
Kind regards,
Goetz
-Original Message-
From: Christian Grün [mailto:christian.gr...@gmail.com]
Sent: Wednesday, 22 April 2015 13:22
To: Goetz Heller
Cc: BaseX
Subject: Re: [basex-talk] Creation of Full-Text-Index failed
...enjoy the fixed version [1].
Christian
[1] http://files.basex.org/releases/latest
On Tue, Apr 21, 2015 at 8:56 PM, Goetz Heller wrote:
> For the task at hand I need to create a database on a daily basis from file
> packages I receive. The language used here is German, however the files
> co
Hi Erol,
I am not volunteering :-) but if somebody wants to take this route, this
code might give some pointers [1].
It uses Apache Spark to run Saxon-HE. There is also an XQuery example [2]
and more info [3].
/Andy
[1] https://github.com/elsevierlabs/spark-xml-utils
[2] https://github.com/elsevierlabs/spark
Thanks, Fabrice!
I'll work it out.
Kind regards,
Goetz
From: Fabrice Etanchaud [mailto:fetanch...@questel.com]
Sent: Wednesday, 22 April 2015 11:32
To: Goetz Heller; basex-talk@mailman.uni-konstanz.de
Subject: RE: [basex-talk] multi-language full-text indexing
Great, Goetz!
A last
OK. Let me do my stuff first. Then I will see if I'm able to dive deep enough
into the BaseX code to come up with some meaningful contribution!
Kind regards,
Goetz
-Original Message-
From: Christian Grün [mailto:christian.gr...@gmail.com]
Sent: Wednesday, 22 April 2015 11:15
Reminds me of an old GitHub issue.. I have added a link to your
request: https://github.com/BaseXdb/basex/issues/59.
On Wed, Apr 22, 2015 at 11:35 AM, Goetz Heller wrote:
> Here's another addendum: Even if multi-language full-text indexing is not
> going to be implemented in the near future, it
Here's another addendum: Even if multi-language full-text indexing is not going
to be implemented in the near future, it still would be a useful feature to be
able to restrict full-text indexing to parts of a document, e.g.
CREATE FULL-TEXT INDEX ON DATABASE XY STARTING WITH (
(path_a)/
Great, Goetz!
One last thing:
If you need to rebuild the original document from parts, be sure to have a way
to retrieve them all (by document path, attribute index, or separate index
collection with node-id/pre values).
If disk space is not an issue, you could store the original document as it
Fabrice,
For the time being, this sounds quite nice. I'd have to split up the files
into a common part and a set of satellites, one satellite for each language
present in the document.
Thanks!
Kind regards,
Goetz
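
The split described above (one common part plus one satellite per language) could be sketched roughly as follows. This is only an illustration under assumptions: it uses Python's standard library rather than BaseX or XQuery, it assumes languages are marked with xml:lang attributes, and the names split_by_language and "satellite" are made up for this example.

```python
import copy
import xml.etree.ElementTree as ET

# ElementTree expands the reserved "xml:" prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def split_by_language(root):
    """Split a document into a common part and per-language satellites.

    Returns (common, satellites). `common` is a copy of `root` with every
    element carrying an xml:lang attribute removed. `satellites` maps each
    language code to a hypothetical <satellite> wrapper element holding the
    removed elements in document order. The input tree is left untouched.
    """
    common = copy.deepcopy(root)
    satellites = {}
    _extract(common, satellites)
    return common, satellites


def _extract(parent, satellites):
    # Iterate over a snapshot, because children are removed while looping.
    for child in list(parent):
        lang = child.get(XML_LANG)
        if lang is None:
            # No language tag here: keep the element, look further down.
            _extract(child, satellites)
        else:
            parent.remove(child)
            wrapper = satellites.setdefault(
                lang, ET.Element("satellite", {"lang": lang})
            )
            wrapper.append(child)
```

Each satellite could then be stored in its own collection with a language-specific full-text index. Note that this sketch does not record where the removed elements came from, so rebuilding the original document would need extra bookkeeping (a path or ID per element).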
From: Fabrice Etanchaud [mailto:fetanch...@questel.com]
Sent: Wednesday, 22.
In a nutshell: It would take some more time to explain all the
implications.. We know that there are various non-trivial issues to be
solved, as we already thought about adding such an index some years
ago.
Cheers,
Christian
On Wed, Apr 22, 2015 at 11:15 AM, Goetz Heller wrote:
> The case you d
The case you described should be made a non-issue:
If a multi-language full-text index was created, then it was surely intended
for searches within the confines of a specific language. Hence, if no language
is specified in the query, a runtime error should be thrown.
Kind regards,
Hi Götz (cc @ basex-talk),
> OK, I think I understand. However, I think there should be some
> possibilities to allow the user to give hints. In my opinion,
> FOR-loops would be first-class candidates to use parallel streams, in
> particular in the use case I described in my previous posting:
>
> FOR $
Any volunteers out there? ;)
On Wed, Apr 22, 2015 at 11:05 AM, Erol Akarsu wrote:
> Christian,
>
> I think we should be able to attach BaseX to Apache Spark. But integration
> code needs to be written.
> Everybody is able to read from Hadoop, Solr, Elasticsearch etc. into Spark and
> process there.
Christian,
I think we should be able to attach BaseX to Apache Spark. But integration
code needs to be written.
Everybody is able to read from Hadoop, Solr, Elasticsearch etc. into Spark and
process there.
Why not for BaseX?
Erol Akarsu
On Wed, Apr 22, 2015 at 4:28 AM, Christian Grün
wrote:
> Hi G
Dear Goetz,
I have the same requirement (patent documents containing text in different
languages).
I ended up splitting/filtering each original document into localized parts,
which are inserted into different collections (each collection having its own
full-text index configuration).
BaseX is as flexible as o
> It is desirable to have
> documents indexed by locale-specific parts, e.g.
I can see that this would absolutely make sense, but it would be quite
some effort to realize it. There are also various conceptual issues
related to XQuery Full Text: If you don't specify the language in the
query, we'd n
Hello,
Excellent. Glad to be of use.
I'll try the new snapshot right away.
Cheers
Simon
On Wed, Apr 22, 2015 at 10:06 AM, Christian Grün
wrote:
> Hi Simon,
>
> I finally had time to look at your examples, and...
>
> > One more detail: [...]
>
> ...seemed to fix it! The original version of th
I'm working with documents destined to be consumed anywhere in the European
Community. Many of them have the same tags multiple times but with a
different language attribute. It therefore does not make sense to create a
single full-text index over these documents as a whole. It is desirable to have
docum
Hi Götz,
> it would
> make perfect sense to parallelize the query. Is there a way to achieve this
> using XQuery?
Our initial attempts to integrate low-level support for
parallelization in XQuery turned out not to be as successful as we
hoped they would be. One reason for that is that you can bas
Hi Simon,
I finally had time to look at your examples, and...
> One more detail: [...]
...seemed to fix it! The original version of this class was written by
Jens (in the cc), but I also believe that the basic problem was that
the locks instance was not synchronized. In my fix, I used a
Concurre
Hi Marc,
> "If the %rest:produces annotation is specified, a function will
> only be invoked if the HTTP Accept header of the request matches one
> of the given types, or if it does not specify any HTTP Accept header at all."
I asked Adam a while ago to get the online version of the spec
upda
Hi Christian,
You are right, foolish of me not to verify on latest or even on 8.1,
where this was fixed already. I was hitting an API that was part of our
software, which still used an 8.0 version.
Just verified it on 8.1 and the latest snapshot, and there it's fine.
One nitpick for the RESTXQ spec thou