Re: [hibernate-dev] [HV/HSEARCH] Free form

2017-01-30 Thread Yoann Rodiere
Hi,

I did the same this weekend, and adapted your work to match the bigger
picture of what we discussed on Friday.
Basically the "StructureTraverser" is now called "ValueProcessor", because
it's no longer responsible for exposing the internals of a structure, but
only for processing a structure according to previously defined metadata
and passing the output to the "DocumentContext". I think it's the second
option you suggested. It makes sense in my opinion, since metadata will be
defined differently for different source types (POJO, JSON, ...).
This design allows in particular what Sanne suggested: when
bootstrapping, we can build some kind of "walker" (a composition of
"ValueProcessors") from the metadata, and avoid metadata lookup at runtime.

The snippet is there:
https://gist.github.com/yrodiere/9ff8fe8a8c7f59c1a051b36db20fbd4d

I'm sure it'll have to be refined to address additional constraints, but in
its current state it seems to address all of our requirements.


Yoann Rodière 
Hibernate NoORM Team

On 27 January 2017 at 18:23, Emmanuel Bernard wrote:

> I used the flight home to play with free form, and specifically with how we
> would retrieve data from a free-form structure.
> By free-form I mean non-POJO structures; they will still have a schema (not
> expressed here).
>
> https://github.com/emmanuelbernard/hibernate-search/commit/0bd3fbab137bdad81bfa5b9934063792a050f537
>
> And in particular
> https://github.com/emmanuelbernard/hibernate-search/blob/freeform/freeform/src/main/java/org/hibernate/freeform/StructureTraverser.java
> https://github.com/emmanuelbernard/hibernate-search/blob/freeform/freeform/src/main/java/org/hibernate/freeform/pojo/impl/PojoStructureTraverser.java
>
> It probably does not compile; I could not make the build work.
>
> I figured it was important to dump this raw thinking because it will
> influence and will be influenced by the redesign of the DocumentBuilder of
> Hibernate Search.
>
> There are several options for traversing a free-form structure:
> - expose the traversing API as a holder to navigate all properties per
> structure and sub-structure. This is what the prototype shows. Cached data
> needs to be accessed via a hashmap get or other lookup. Metadata and the
> traversing structure will be navigated in parallel.
> - expose a structure that is specialized to a single property or container
> unwrapping aspect. The structures will be spread across and embedded in the
> metadata.
>
>
> Another angle:
> - create a traversable object per payload to carry it (sharing metadata
> info per type)
> - have a stateless traversable object that is provided the payload for
> each access
>
> The latter seems better as it does not create a traversable object per
> object navigated.
> The former is better for payloads that need parsing or are better at
> sequential access, since state could be cached.
>
> We need to discuss that and know where DocumentBuilder is going to
> properly design this API.
>
> Emmanuel
> ___
> hibernate-dev mailing list
> hibernate-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>

Re: [hibernate-dev] [HV/HSEARCH] Free form

2017-02-06 Thread Emmanuel Bernard
Your prototype is very much tied to Hibernate Search. I wonder how, or whether,
we want it to be reusable across Hibernate Validator, Search and possibly more.

Have you captured somewhere the discussion about the new document builder, so I
could get a better grip on what’s at stake?
Would this inversion of logic also be embraced by Hibernate Validator? There are
runtime decisions made in HV during traversal that make me doubt that it would
be as pertinent.




Re: [hibernate-dev] [HV/HSEARCH] Free form

2017-02-07 Thread Gunnar Morling
Emmanuel,

In your PoC, how would a complete tree-like structure be traversed?
It's not clear to me who is driving StructureTraverser, i.e. which
component will call processSubstructureInContainer() et al. when
traversing an entire tree.

@Yoann, maybe you can add a usage example similar to Emmanuel's? You
have a lot of framework code, but I'm not sure about how it'd be used.

For Hibernate Search, the traversal pattern I implemented for the
ScenicView PoC may be of interest. Its general idea is to represent a
tree traversal as a sequence of events which a traverser
implementation receives and can act on, e.g. to create a corresponding
de-normalized structure, Lucene document etc. The retrieval of values
and associated objects happens lazily as the traverser
("TreeTraversalEventConsumer" in my lingo) pulls events from the
sequence, similar to what some XML parsers do.
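
To make the event-based idea concrete, here is a minimal sketch for a
java.util.Map input. It is not the ScenicView contract: every name in it
(Event, MapEventSequence, flatten) is invented for this illustration, and it
only assumes that a tree can be exposed as an Iterator of start/property/end
events that the consumer pulls one at a time.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical sketch: none of these names come from ScenicView or Hibernate Search.
class EventTraversalSketch {

    enum Kind { OBJECT_START, PROPERTY, OBJECT_END }

    // One traversal event; value is only set for PROPERTY events.
    static final class Event {
        final Kind kind;
        final String name;
        final Object value;

        Event(Kind kind, String name, Object value) {
            this.kind = kind;
            this.name = name;
            this.value = value;
        }
    }

    // Turns a nested Map into a flat sequence of events, computed on demand:
    // it only ever looks one event ahead of what the consumer has pulled.
    static final class MapEventSequence implements Iterator<Event> {
        private final Deque<String> names = new ArrayDeque<>();
        private final Deque<Iterator<Map.Entry<String, Object>>> open = new ArrayDeque<>();
        private Event upcoming;

        MapEventSequence(String rootName, Map<String, Object> root) {
            upcoming = enter(rootName, root);
        }

        private Event enter(String name, Map<String, Object> map) {
            names.push(name);
            open.push(map.entrySet().iterator());
            return new Event(Kind.OBJECT_START, name, null);
        }

        @Override
        public boolean hasNext() {
            return upcoming != null;
        }

        @Override
        public Event next() {
            if (upcoming == null) {
                throw new NoSuchElementException();
            }
            Event current = upcoming;
            upcoming = advance();
            return current;
        }

        @SuppressWarnings("unchecked")
        private Event advance() {
            if (open.isEmpty()) {
                return null; // the root object has already been closed
            }
            if (!open.peek().hasNext()) {
                open.pop();
                return new Event(Kind.OBJECT_END, names.pop(), null);
            }
            Map.Entry<String, Object> entry = open.peek().next();
            if (entry.getValue() instanceof Map) {
                return enter(entry.getKey(), (Map<String, Object>) entry.getValue());
            }
            return new Event(Kind.PROPERTY, entry.getKey(), entry.getValue());
        }
    }

    // Example consumer: flattens the events into "dotted.path -> value" pairs.
    static Map<String, Object> flatten(Iterator<Event> events) {
        Map<String, Object> flat = new LinkedHashMap<>();
        Deque<String> path = new ArrayDeque<>();
        while (events.hasNext()) {
            Event event = events.next();
            switch (event.kind) {
                case OBJECT_START: path.addLast(event.name); break;
                case OBJECT_END: path.removeLast(); break;
                default: flat.put(String.join(".", path) + "." + event.name, event.value);
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        Map<String, Object> author = new LinkedHashMap<>();
        author.put("name", "Frank Herbert");
        Map<String, Object> book = new LinkedHashMap<>();
        book.put("title", "Dune");
        book.put("author", author);
        // prints {book.title=Dune, book.author.name=Frank Herbert}
        System.out.println(flatten(new MapEventSequence("book", book)));
    }
}
```

The consumer never sees the Map itself, only events, so a different source
(entity state, JSON) would just need another Iterator implementation.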

The main contract can be found at [1]. There are two event sequence
implementations, one based on Hibernate's meta-model [2] and one for
java.util.Map [3]. An example event consumer implementation which
creates MongoDB documents can be found at [4].

As said, I think it'd be a nice fit for Hibernate Search; for HV I'm not
so sure. The reason is that the order of traversal may vary,
depending on the defined validation groups and sequences. Sometimes we
need to go "depth first". I've been contemplating employing an
event-like approach as described above for HV, but it may look
different from the one used for HSEARCH.

--Gunnar

[1] https://github.com/gunnarmorling/scenicview-mvp/blob/master/core/src/main/java/org/hibernate/scenicview/spi/backend/model/TreeTraversalSequence.java
[2] https://github.com/gunnarmorling/scenicview-mvp/blob/master/core/src/main/java/org/hibernate/scenicview/internal/model/EntityStateBasedTreeTraversalSequence.java
[3] https://github.com/gunnarmorling/scenicview-mvp/blob/master/core/src/test/java/org/hibernate/scenicview/test/traversal/MapTreeTraversalSequence.java
[4] https://github.com/gunnarmorling/scenicview-mvp/blob/master/mongodb/src/main/java/org/hibernate/scenicview/mongodb/internal/MongoDbDenormalizationBackend.java#L91..L128




Re: [hibernate-dev] [HV/HSEARCH] Free form

2017-02-07 Thread Yoann Rodiere
This conversation is starting to get a bit complex, so I'll try to organize
my answers:

# Applying the same solution to HV and HSearch

@Emmanuel: right, I didn't see you were also talking about HV. I was only
considering the HSearch case.

I think I agree with you both, HV and HSearch are a bit different and we
certainly cannot share the whole code.
Some principles could probably be shared, such as the abstraction over
accessing the input type with Emmanuel's "StructureTraverser".
But the traversal algorithms are probably very different. And in fact,
these traversals are at the core of each project's purpose, so it may not
be a good idea to try to make them "more similar".

# The requirements for HSearch

@Emmanuel: we didn't take many notes, but we did draw a diagram of the
target architecture:

https://drive.google.com/a/redhat.com/file/d/0B_z-zSf_hJiZamJkZFBlNG5CeDQ/view?usp=sharing

When you shared your recordings/pictures, I asked for the write permission
on the shared folder to put the diagram, but you probably haven't had time
yet.

If I remember correctly, these were the main requirements:

   - Separate the source data traversal from the actual output format.
      - This will help when implementing different indexing services
      (Elasticsearch, Solr): we don't want to assume anything about the
      target format.
   - Make the implementation of JGroups/JMS as simple as possible.
      - In this case, we don't really want to build documents, we just
      want to transform the entity into a serializable object, and reduce
      the information to transmit over the network to a minimum.
      - Ideally, we'd just want to "record" the output of the traversal,
      transmit this recording to the master node, and let the master node
      replay it to build a document. This would have the added benefit of
      not requiring any knowledge of the underlying technology
      (Lucene/ES/Solr) on the client side.
   - Requirements on the "mapping tree" (I'm not absolutely sure about
   those, Sanne may want to clarify):
      - “depth” and navigational graph to be pre-computed: tree of valid
      fields and options to be known in advance.
      - Immutable, thread-safe, easy to inspect/walk mapping tree.
      - And on my end (I think Sanne shared this concern, but I may be
      wrong): query metadata as little as possible at runtime.
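
The "record, transmit, replay" requirement above could be sketched as
follows. This is only an illustration under assumptions: DocumentSink,
RecordingSink and MapSink are hypothetical names, Java serialization stands
in for the JGroups/JMS transport, and a nested Map stands in for the
backend-specific document.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: these types are invented for illustration only.
class RecordReplaySketch {

    // What the traversal writes to; it knows nothing about Lucene/ES/Solr.
    interface DocumentSink {
        void addField(String name, String value);
        void startObject(String name);
        void endObject();
    }

    // Client side: stores the calls instead of building a document.
    static final class RecordingSink implements DocumentSink, Serializable {
        private static final long serialVersionUID = 1L;
        // Each step is {operation, name, value}; plain strings keep it serializable.
        private final ArrayList<String[]> steps = new ArrayList<>();

        @Override public void addField(String name, String value) {
            steps.add(new String[] { "field", name, value });
        }
        @Override public void startObject(String name) {
            steps.add(new String[] { "start", name, null });
        }
        @Override public void endObject() {
            steps.add(new String[] { "end", null, null });
        }

        // Master side: replays the recorded calls against a backend-specific sink.
        void replay(DocumentSink target) {
            for (String[] step : steps) {
                switch (step[0]) {
                    case "field": target.addField(step[1], step[2]); break;
                    case "start": target.startObject(step[1]); break;
                    default: target.endObject();
                }
            }
        }
    }

    // Stand-in for a real backend: builds a nested Map as the "document".
    static final class MapSink implements DocumentSink {
        final Map<String, Object> root = new LinkedHashMap<>();
        private final Deque<Map<String, Object>> open = new ArrayDeque<>();

        MapSink() { open.push(root); }

        @Override public void addField(String name, String value) { open.peek().put(name, value); }
        @Override public void startObject(String name) {
            Map<String, Object> child = new LinkedHashMap<>();
            open.peek().put(name, child);
            open.push(child);
        }
        @Override public void endObject() { open.pop(); }
    }

    // Simulates the network hop by serializing and deserializing the recording.
    static RecordingSink roundTrip(RecordingSink recording) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(recording);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (RecordingSink) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        RecordingSink recording = new RecordingSink();
        recording.addField("title", "Dune");
        recording.startObject("author");
        recording.addField("name", "Frank Herbert");
        recording.endObject();

        MapSink document = new MapSink();
        roundTrip(recording).replay(document);
        System.out.println(document.root); // {title=Dune, author={name=Frank Herbert}}
    }
}
```

Because the traversal only talks to the sink interface, the same traversal
code can feed either a recording or a real document builder.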

# More info on my snippet

@Gunnar: you asked for some client code, but I'm not sure it'll be very
telling. The only client-facing interface (as far as document building
goes) is EntityDocumentConverter.
So, the parts of the application that need to convert an entity to a
document will do something like this:

EntityDocumentConverter converter = indexManager.getEntityDocumentConverter();
D document = converter.convert( entity );
indexManager.performOperation( newAddOperation( document ) );

The idea behind this was to make runtime code as simple as possible, and
move the complexity to the bootstrapping.
Basically, when you call converter.convert, it will delegate to
ValueProcessors, which will extract information from the entity and inject
it into the DocumentBuilder. What is extracted, and how to extract it, is
completely up to the ValueProcessor.
This means that, when bootstrapping, a tree of ValueProcessors will be
built according to the metadata. For instance when a @Field is encountered,
we build an appropriate ValueProcessor (potentially nesting multiple ones
if we want to keep matters separate: one for extracting the property's
value, one for transforming this value using a bridge). When an
@IndexedEmbedded is encountered, we build a different ValueProcessor. And
so on.
Here is an (admittedly very simple) example of what it'd look like in the
metadata processor:

  List<ValueProcessor> collectedProcessors = new ArrayList<>();
  for ( XProperty property : properties ) {
    Field fieldAnnotation = property.getAnnotation( Field.class );
    if ( fieldAnnotation != null ) {
      ValueProcessor fieldBridgeProcessor = createFieldBridgeProcessor(
          property.getType(), fieldAnnotation );
      // The value of the property will be passed to the
      // fieldBridgeProcessor at runtime
      ValueProcessor propertyProcessor = new JavaPropertyProcessor(
          property, fieldBridgeProcessor );
      collectedProcessors.add( propertyProcessor );
    }
  }
  ValueProcessor rootProcessor = new CompositeProcessor(
      collectedProcessors );
  return new EntityDocumentConverter( rootProcessor,
      indexManagerType.getDocumentBuilder() );

The actual code will obviously be more complex, first because we need to
support many more features than just @Field, but also because the
createFieldBridgeProcessor() method needs to somehow build backend-specific
metadata based on the nature of the field. But I think the snippet captures
the spirit.
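
For readers who want something they can run, here is a self-contained
reduction of the same idea. It is a sketch under assumptions, not the code
from the gist: the ValueProcessor and DocumentContext signatures are guesses,
and a map of field name to accessor function stands in for scanning @Field
annotations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: the signatures below are guesses made for illustration,
// not the ones from the gist or from Hibernate Search.
class ValueProcessorSketch {

    // Receives the output of the traversal (stand-in for the "DocumentContext").
    interface DocumentContext {
        void addField(String name, String value);
    }

    // Processes one value according to metadata captured when it was built.
    interface ValueProcessor {
        void process(Object value, DocumentContext context);
    }

    // Leaf: writes the value under a fixed field name
    // (a very reduced stand-in for a field bridge).
    static ValueProcessor fieldBridge(String fieldName) {
        return (value, context) -> context.addField(fieldName, String.valueOf(value));
    }

    // Extracts a property from the source and hands it to a nested processor.
    static ValueProcessor property(Function<Object, Object> accessor, ValueProcessor nested) {
        return (source, context) -> nested.process(accessor.apply(source), context);
    }

    // Runs every child against the same source.
    static ValueProcessor composite(List<ValueProcessor> children) {
        List<ValueProcessor> copy = new ArrayList<>(children);
        return (source, context) -> copy.forEach(child -> child.process(source, context));
    }

    // Bootstrapping: walk the (here trivial) metadata once and build the processor tree.
    static ValueProcessor bootstrap(Map<String, Function<Object, Object>> indexedProperties) {
        List<ValueProcessor> collected = new ArrayList<>();
        indexedProperties.forEach((fieldName, accessor) ->
                collected.add(property(accessor, fieldBridge(fieldName))));
        return composite(collected);
    }

    // Runtime: no metadata lookup, just run the pre-built tree.
    static Map<String, String> convert(ValueProcessor root, Object entity) {
        Map<String, String> document = new LinkedHashMap<>();
        root.process(entity, document::put);
        return document;
    }

    static final class Book {
        final String title;
        final int pages;

        Book(String title, int pages) {
            this.title = title;
            this.pages = pages;
        }
    }

    public static void main(String[] args) {
        Map<String, Function<Object, Object>> metadata = new LinkedHashMap<>();
        metadata.put("title", book -> ((Book) book).title);
        metadata.put("pages", book -> ((Book) book).pages);

        ValueProcessor root = bootstrap(metadata); // done once
        System.out.println(convert(root, new Book("Dune", 412))); // {title=Dune, pages=412}
    }
}
```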

# Summary

Thinking about it a little, there's a different focus in our solutions.

   1. Emmanuel's solution focuses on abstracting over the input data
   format (thanks to

Re: [hibernate-dev] [HV/HSEARCH] Free form

2017-02-08 Thread Emmanuel Bernard

> On 7 Feb 2017, at 11:17, Gunnar Morling  wrote:
> 
> Emmanuel,
> 
> In your PoC, how would a complete tree-like structure be traversed?
> It's not clear to me, who is driving StructureTraverser, i.e. which
> component will call processSubstructureInContainer() et al. when
> traversing an entire tree.

The metadata you have about which entities and properties are indexed will
drive the traversal. In the case of HV, the metadata about which entities /
properties are constrained will.
The metadata is a separate model because:
1. some structures like JSON have no real model
2. we still need additional information like @Field, @Valid etc. as part of our
metadata to drive navigation
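
A minimal sketch of that split, assuming a Map as the schema-less input: a
separate metadata tree decides which properties to visit, while an accessor
only knows how to read from the structure. StructureAccessor and
PropertyMetadata are hypothetical names chosen for this illustration; they
are not the StructureTraverser API from the PoC.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names and signatures are invented, not taken from the PoC.
class MetadataDrivenSketch {

    // Abstraction over the input format (POJO, JSON, Map, ...): it only knows how to read.
    interface StructureAccessor {
        Object get(Object structure, String property);
    }

    // Separate metadata model: which properties matter and which ones to descend into.
    static final class PropertyMetadata {
        final String name;
        final List<PropertyMetadata> embedded;

        PropertyMetadata(String name, PropertyMetadata... embedded) {
            this.name = name;
            this.embedded = Arrays.asList(embedded);
        }
    }

    // The metadata decides where to go; properties it does not mention are never read.
    static void traverse(Object structure, List<PropertyMetadata> metadata,
            StructureAccessor accessor, String prefix, Map<String, Object> output) {
        for (PropertyMetadata property : metadata) {
            Object value = accessor.get(structure, property.name);
            if (value == null) {
                continue;
            }
            if (property.embedded.isEmpty()) {
                output.put(prefix + property.name, value); // leaf, comparable to @Field
            } else {
                // nested structure, comparable to @IndexedEmbedded or @Valid
                traverse(value, property.embedded, accessor, prefix + property.name + ".", output);
            }
        }
    }

    // Accessor for schema-less Map input (a stand-in for parsed JSON).
    static final StructureAccessor MAP_ACCESSOR =
            (structure, property) -> ((Map<?, ?>) structure).get(property);

    public static void main(String[] args) {
        Map<String, Object> author = new LinkedHashMap<>();
        author.put("name", "Frank Herbert");
        author.put("email", "not indexed");
        Map<String, Object> book = new LinkedHashMap<>();
        book.put("title", "Dune");
        book.put("isbn", "not indexed");
        book.put("author", author);

        List<PropertyMetadata> metadata = Arrays.asList(
                new PropertyMetadata("title"),
                new PropertyMetadata("author", new PropertyMetadata("name")));

        Map<String, Object> output = new LinkedHashMap<>();
        traverse(book, metadata, MAP_ACCESSOR, "", output);
        System.out.println(output); // {title=Dune, author.name=Frank Herbert}
    }
}
```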