No, that makes it non-portable, provider-specific.  That is, I can't
cut-n-paste that from one graph db to the next.  Also, wouldn't you need to
configure a serializer for 'DseGraph.searchType'?

I think we can start with a small, simple set.

startsWith(String)
startsWith(Search, String)
contains(String)
contains(Search, String)
regex(String)
regex(Search, String)

Each takes an optional Search plus a String, where Search is an enum of
String (the default) or Text (tokenized), and the String argument is the
search term.

The regex syntax may be provider-specific, but the traversal would be
portable. If the provider doesn't override the step/predicate then it would
use the default implementation.
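
To make the idea concrete, here is a rough sketch of what a default
implementation might look like in plain Java. None of this is existing
TinkerPop API: the class name TextPredicates, the STRING/TEXT constant names,
the \W+ tokenizer, and the choice of full-match regex semantics are all
assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical holder for the proposed predicates; not an existing TinkerPop class.
class TextPredicates {

    // Hypothetical enum from the proposal: STRING tests the whole value,
    // TEXT tests each token of the value.
    public enum Search { STRING, TEXT }

    // Assumption: tokens are runs of word characters, with no case folding.
    private static Predicate<String> apply(Search search, Predicate<String> test) {
        if (search == Search.STRING) {
            return test;
        }
        return value -> Arrays.stream(value.split("\\W+")).anyMatch(test);
    }

    public static Predicate<String> startsWith(String term) {
        return startsWith(Search.STRING, term);
    }

    public static Predicate<String> startsWith(Search search, String term) {
        return apply(search, s -> s.startsWith(term));
    }

    public static Predicate<String> contains(String term) {
        return contains(Search.STRING, term);
    }

    public static Predicate<String> contains(Search search, String term) {
        return apply(search, s -> s.contains(term));
    }

    public static Predicate<String> regex(String pattern) {
        return regex(Search.STRING, pattern);
    }

    // Assumption: the pattern must match the whole value (or the whole token).
    public static Predicate<String> regex(Search search, String pattern) {
        Pattern compiled = Pattern.compile(pattern);
        return apply(search, s -> compiled.matcher(s).matches());
    }
}
```

With this sketch, startsWith("brown") rejects "quick brown fox" because the
whole value is tested, while startsWith(Search.TEXT, "brown") accepts it
because one of the tokens starts with the term. A provider with a real search
index would presumably override these rather than use them.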

Then all that's left is fuzzy.  I don't have an opinion on that yet. Maybe
it's more Search enums?

Robert Dale


On Thu, Jun 21, 2018 at 3:04 PM Stephen Mallette <spmalle...@gmail.com>
wrote:

> Just thinking out loud here, but i wonder if we could keep our predicate
> list more or less as-is, but then use with() to modulate a has() to be
> provider specific:
>
> g.V().
>    has('longText',eq("a.*")).
>      with(DseGraph.searchType, tokenRegex)
>
> In other words, this would be the standard way that users would inform
> graph providers to handle special text search types. The upside is that
>
> 1. graph providers no longer have to hassle with serialization at all to
> implement this (which means users don't need special configuration of their
> servers/drivers).
> 2. we have a common way that all graph providers can take advantage of and
> thus users have one method for writing their gremlin (albeit with different
> with() and search syntax).
> 3. we can make this part of our reference implementation i think pretty
> easily for TinkerGraph with some basic java regex stuff.
> 4. stays backward compatible with existing graph provider predicates
>
> good idea?
>
>
>
> On Mon, Jun 11, 2018 at 9:12 AM Stephen Mallette <spmalle...@gmail.com>
> wrote:
>
> > I found a CosmosDB issue on github calling for support of text predicates
> >
> > https://github.com/Azure/azure-documentdb-dotnet/issues/473
> >
> > and it conveniently listed the text predicates for a number of different
> > graphs, so it made the job of compiling these pretty easy.
> >
> > DSE Graph (tokenized search is for long multi-sentence type properties)
> > + eq/neq
> > + prefix
> > + regex
> > + token
> > + tokenPrefix
> > + tokenRegex
> > + phrase
> > + fuzzy
> > + tokenFuzzy
> >
> > https://docs.datastax.com/en/dse/6.0/dse-dev/datastax_enterprise/graph/using/useSearchIndexes.html
> >
> > JanusGraph
> > + textContains
> > + textContainsPrefix
> > + textContainsRegex
> > + textContainsFuzzy
> > + eq/neq
> > + textPrefix
> > + textRegex
> > + textFuzzy
> >
> > http://docs.janusgraph.org/latest/index-parameters.html#text-search
> >
> > Neo4j/Cypher
> > + STARTS WITH
> > + ENDS WITH
> > + CONTAINS
> >
> > http://www.jexp.de/blog/html/full-text-and-spatial-search-in-neo4j-3.html
> >
> > OrientDB - basically just lucene syntax
> > + LUCENE
> >
> > https://orientdb.com/docs/last/Full-Text-Index.html
> >
> > So - that's the list as best I can determine. JanusGraph and DSE Graph
> > have the most complex sets of expressions, it seems. Neo4j/Cypher has the
> > simplest, most developer-friendly set, which probably covers most of the
> > questions we get out in the community. OrientDB gets vendor-specific in
> > what they do.  Did I leave any out? Please update this thread if I did.
> >
> > Not sure what we do with that now, but that's what is out there.
> >