Re: [DISCUSS] Project Road Map

2016-04-28 Thread DO YUNG YOON
I am also a fan of Neo4J’s web interface, and I agree with Jong Wook's
opinion that we should think about the UI from scratch.

I think simply visualizing the graph structure could be done with a little
JavaScript work (maybe d3.js or something similar), since the s2graph query
has a "returnTree" feature that returns the entire tree the query traversed
(in BFS order).

I think most folks have had a chance to look at where we are heading in the
big picture (please give feedback if anything is not clear).

I want to start discussing what we will prioritize from the above list and,
more importantly, what our focus will be for our first release.
Here is my opinion.

Low Latency, High Throughput, Serialize Schema, BFS Query, Documentation,
Project Homepage.

The items above are the basics for storing/selecting data as Edge/Vertex and
traversing them, which I think are the most important features.
What do you guys think?



On Tue, Apr 12, 2016 at 11:27 PM Alexander Bezzubov wrote:

> Sure, It's totally up to you guys!
>
> I was suggesting a way to save some development efforts only on a query UI.
>
> Somehow having the "best selling point" for a graph storage system be an
> embedded webapp sounds a bit, well, surprising to me :) But that is just
> feedback, no back-seat driving here.
>
> I can imagine, though, that having a handy tool to visualize graph structure
> can be very useful, e.g. for debugging.
>
> Great to see such an ambitious roadmap for the project!
>
> --
> Alex
>
>
>
> On Tue, Apr 12, 2016 at 7:39 PM, Jong Wook Kim wrote:
>
> > I know that Zeppelin is very good at interactively plotting something out
> > of dataframes, like pie charts, histograms, line graphs, etc.
> >
> > But I'm not quite sure it makes visualizing graph structures any easier
> > than starting from scratch.
> >
> > I have been very pleased with Neo4J’s web interface, which is embedded in
> > its server. Using Zeppelin as the primary visualizer might overcomplicate
> > things, as it will require configuring the whole Zeppelin distribution as a
> > subproject of S2Graph. There would also be a lot more JVM processes to
> > manage - one for the Zeppelin server and one interpreter process for every
> > notebook.
> >
> > I’m not trying to be NIH or anything, but looking at the embedded web UIs
> > of Neo4J, Apache Spark and RethinkDB, I think it would be nicer to think
> > about the UI from scratch - it could be the best selling point of s2graph.
> >
> >
> > Best,
> > Jong Wook
> >
> >
> > > On Apr 12, 2016, at 6:08 AM, Alexander Bezzubov wrote:
> > >
> > > Sounds like a great plan to me as well. Thank you for sharing the details;
> > > please keep it up and keep the mailing list posted!
> > >
> > > As for the "Query Graphical User Interface", I would suggest trying
> > > Zeppelin out and just providing an interpreter implementation [1] for the
> > > query language you choose, as it's very simple and a nice GUI comes for
> > > free.
> > >
> > > 1. http://zeppelin.incubator.apache.org/docs/0.6.0-incubating-SNAPSHOT/development/writingzeppelininterpreter.html
> > >
> > > --
> > > Alex
> > >
> > >
> > > On Tue, Apr 5, 2016 at 11:30 PM, Luke Han wrote:
> > >
> > >> Hi Doyung and Jo,
> > >> Actually, I have no concerns about supporting storage engines other
> > >> than HBase. Refactoring the existing design to support more engines will
> > >> make the project more suitable for different usages.
> > >>
> > >> But the question here is that the community did not know why until you
> > >> guys started to discuss it on the mailing list and replied above. Please
> > >> keep moving on and bring more discussion to the mailing list.
> > >>
> > >> Thanks.
> > >> Luke
> > >>
> > >>
> > >>
> > >> Best Regards!
> > >> -
> > >>
> > >> Luke Han
> > >>
> > >> On Fri, Apr 1, 2016 at 1:52 PM, Hyunsung Jo wrote:
> > >>
> > >>> Doyoung,
> > >>>
> > >>> Thank you for sharing the document!
> > >>>
> > >>>
> > >>> Luke and Alexander,
> > >>>
> > >>> Do you have any concerns regarding supporting multiple storage engines?
> > >>>
> > >>> As far as I understand, although S2Graph began exclusively on top of
> > >>> HBase, it always had other storage engines in mind.
> > >>> Perhaps this is somewhat unclear in the proposal, but I see hints of the
> > >>> plan for additional storages in statements such as -
> > >>> S2Graph provides a scalable distributed graph database engine over
> > >>> *a key/value store such as HBase*.
> > >>> This is also why some of the earliest JIRA tickets (S2GRAPH-1, 51) cover
> > >>> this topic. (Now that I think of it, we should have had this discussion
> > >>> prior to opening the tickets, but better late than never!)
> > >>> Thanks to the recent refactoring (S2GRAPH-17) as Doyoung mentioned, I
> > >>> think the latest storage-related code is abstract + general enough to
> > >>> try out
> > 

Separate build rowKey, qualifier, value for KeyValue in Serializable.

2016-04-28 Thread DO YUNG YOON
When a user query is provided, we build an RPC request (GetRequest, Scanner)
to fetch the list of KeyValues from storage.
For a query, I think we mostly need only the bytes for the rowKey
(GetRequest), but currently the building of rowKey, qualifier, and value in
Serializable is not separated, so we have to serialize the qualifier and
value even though we don't need them.
I suggest separating the building of rowKey, qualifier, and value in
Serializable so that, for a query, we can skip building the qualifier and
value, which are unnecessary.
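
To make the suggestion concrete, here is a minimal sketch in Java of what
the separation could look like. This is not the current S2Graph code: the
names KeyValueSerializable, EdgeSerializable, RequestBuilder, and the byte
layout are all hypothetical and chosen only for illustration. The point is
that the read path calls toRowKey() alone, while the write path calls all
three methods.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical interface: the real Serializable in S2Graph may be shaped differently.
interface KeyValueSerializable {
    byte[] toRowKey();     // needed for both reads and writes
    byte[] toQualifier();  // needed for writes only
    byte[] toValue();      // needed for writes only
}

// Hypothetical edge serializer; the byte layout is illustrative, not S2Graph's.
final class EdgeSerializable implements KeyValueSerializable {
    private final String srcVertexId;
    private final String label;
    private final String tgtVertexId;
    private final String props;

    // Counters exist only to make the skipped work visible in the demo.
    int qualifierBuilds = 0;
    int valueBuilds = 0;

    EdgeSerializable(String srcVertexId, String label, String tgtVertexId, String props) {
        this.srcVertexId = srcVertexId;
        this.label = label;
        this.tgtVertexId = tgtVertexId;
        this.props = props;
    }

    @Override
    public byte[] toRowKey() {
        return utf8(srcVertexId + ":" + label);
    }

    @Override
    public byte[] toQualifier() {
        qualifierBuilds++;
        return utf8(tgtVertexId);
    }

    @Override
    public byte[] toValue() {
        valueBuilds++;
        return utf8(props);
    }

    private static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}

// Hypothetical request builder showing the two paths.
final class RequestBuilder {
    private RequestBuilder() {}

    // Read path: a get needs only the row key, so nothing else is serialized.
    static byte[] buildGetRowKey(KeyValueSerializable s) {
        return s.toRowKey();
    }

    // Write path: a put needs row key, qualifier, and value.
    static byte[][] buildPut(KeyValueSerializable s) {
        return new byte[][] { s.toRowKey(), s.toQualifier(), s.toValue() };
    }
}

class SeparateSerializationDemo {
    public static void main(String[] args) {
        EdgeSerializable edge = new EdgeSerializable("v1", "friend", "v2", "{\"since\":2016}");
        byte[] rowKey = RequestBuilder.buildGetRowKey(edge);
        System.out.println(new String(rowKey, StandardCharsets.UTF_8)
                + " qualifierBuilds=" + edge.qualifierBuilds
                + " valueBuilds=" + edge.valueBuilds);
    }
}
```

In this sketch, building a get request leaves both counters at zero, which
is the work the proposal wants to skip on the query path.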


Here is what I found from profiling with jvisualvm.
Note that the buildRequest function takes more resources than fetchInner
(the actual I/O request to storage). I think this should be fixed. What do
you guys think?

[image: buildRequest.png]