Hi Talat,

Please make sure you submit a proposal before the deadline, which is March 27, 16:00 UTC.
Regards
Kevin

On Fri, Mar 16, 2018 at 6:18 AM, lewis john mcgibbney <lewi...@apache.org> wrote:

> Hi Talat,
> In all honesty, I don't have the time I used to have to look into this.
> I have been experimenting with Arrow on multi-dimensional array-based
> data, but nothing else.
> I would therefore probably be learning as much as you if this project were
> to go ahead.
> Lewis
>
> On Thu, Mar 15, 2018 at 3:46 PM, Talat Uyarer <ta...@uyarer.com> wrote:
>
>> @Lewis I found a PR [0] in the Arrow Git repo. I guess they stuck with the
>> avro-c library. Do you know whether they need to implement the same thing
>> for every language they support, or whether they just need to implement a
>> wrapper?
>>
>> If we can use Arrow for our internal serialization, Gora will be super
>> fast with zero-copy support. :)
>>
>> [0] https://github.com/apache/arrow/pull/1026
>>
>> My 2 cents
>>
>> On Thu, Mar 15, 2018 at 12:24 AM, lewis john mcgibbney <lewi...@apache.org> wrote:
>>
>>> Hi Renato,
>>>
>>> On Wed, Mar 14, 2018 at 3:22 PM, Renato Marroquín Mogrovejo <renatoj.marroq...@gmail.com> wrote:
>>>
>>>> Hey guys,
>>>>
>>>> There might not be an integration or converters from Arrow to Avro
>>>> (and/or vice versa) because there are Parquet readers that can take
>>>> Avro, and once the data is in Parquet, Arrow can be used directly.
>>>>
>>>
>>> Yes, there might not be. I actually raised this issue [0] a wee while ago
>>> on the Arrow list. At that time I was told, "...The use case you outline
>>> makes a lot of sense for Arrow to help out with. We don't yet have an AVRO
>>> <> Arrow converter written but it is something that would be great to
>>> have." So maybe that would be something to keep in mind.
>>>
>>> [0] https://s.apache.org/2GwS
>>>
>>>
>>>> Regarding an integration of Parquet with Gora, I think it would be
>>>> interesting to make it easier for people to read and write Parquet files
>>>> by providing a higher-level API, as Gora does.
>>>> However, @Talat, since you know Gora pretty well, maybe you could take
>>>> on another project that helps Gora more, for example fixing the
>>>> integration with Nutch. There are multiple loose ends in Nutch 2.x and
>>>> Gora that we have neglected as a community.
>>>> IMHO that should be a GSoC project.
>>>>
>>>
>>> ACK, other existing projects which consume Gora are (off the top of my
>>> head):
>>>
>>> - Chukwa - https://s.apache.org/cW6a
>>> - Giraph - https://github.com/apache/giraph/tree/trunk/giraph-gora
>>> - Camel - https://camel.apache.org/gora.html
>>> - Nutch 2.X - https://github.com/apache/nutch/tree/2.x
>>>
>>> An interesting idea I had for where Gora could be used is Hadoop metrics:
>>>
>>> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Metrics.html
>>>
>>> This would provide a textbook use of Gora: storing Hadoop metrics in some
>>> datastore, which would then be exposed for query and analysis.
>>>
>>>> I can't mentor it because I do not have enough insight into this, but
>>>> @Lewis and @Talat, you can probably tackle this as mentor and student.
>>>> This would be an awesome contribution to the project, as there are quite
>>>> a lot of people going over Nutch and trying to use it with Gora.
>>>> Just my 2c
>>>>
>>>
>>> Understood Renato, no biggie. Thanks for your input. I know you are
>>> working with Parquet a lot these days, so your input is appreciated.
>>> Lewis
>>>
>>
>> --
>> Talat UYARER
>> Website: http://talat.uyarer.com
>> Twitter: http://twitter.com/talatuyarer
>> Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304
>>
>
> --
> http://home.apache.org/~lewismc/
> http://people.apache.org/keys/committer/lewismc
>