Excellent.
Apologies for being absent. I am undergoing a job transition and it has
been very busy.
I suggest that we start a weekly tagup as well.
Lewis

On Sun, Jun 2, 2019 at 1:14 PM Sheriffo Ceesay <sneceesa...@gmail.com>
wrote:

> The code so far is available at the GitHub link below.
>
> https://github.com/sneceesay77/gora/tree/GORA-532/gora-benchmark
>
>
>
> **Sheriffo Ceesay**
>
>
> On Sun, Jun 2, 2019 at 8:34 PM Sheriffo Ceesay <sneceesa...@gmail.com>
> wrote:
>
>> Hi Renato,
>>
>> Thanks for the detailed reply. I agree with your recommendations on the
>> way forward. I will go ahead and implement the rest of the functionality
>> using reflection and we can follow your recommendations on the next
>> iterations.
>>
>> As for the backend, I am using both HBase and MongoDB and all seems well
>> at the moment.
>>
>> I will let you all know when I push my code to GitHub.
>>
>> Thank you.
>>
>>
>> **Sheriffo Ceesay**
>>
>>
>> On Sun, Jun 2, 2019 at 7:01 PM Renato Marroquín Mogrovejo <
>> renatoj.marroq...@gmail.com> wrote:
>>
>>> Hi Sheriffo,
>>>
>>> Some opinions about your questions, but others are more than welcome
>>> to suggest other things as well.
>>>
>>> Q1: Are we going to consider arbitrary field length, e.g. if we set
>>> the fieldcount to 100 then we have to create the respective Avro and
>>> mapping files? Currently,
>>> I don't think this process is automated and may be tedious for large
>>> field counts.
>>> I think for the first code iteration, we should use whatever
>>> fieldcount you have already generated files for. Ideally, we should
>>> be able to invoke the Gora bean generator and generate as many fields
>>> as required by the benchmark configuration.
>>>
>>> Q2: The second problem follows from the first one: if we
>>> allow arbitrary field counts, then there has to be a mechanism to call
>>> each of the set or get methods during CRUD operations. So to avoid
>>> this I used Java Reflection. See the sample code below.
>>> We have some options to deal with having an arbitrary number of fields.
>>> 1) Use reflection, as you have, which might be OK for the first code
>>> iteration, but if we want decent performance compared with using the
>>> datastores natively (no Gora), we should move away from it.
>>> 2) Do Gora class generation (and also generate the method used to
>>> insert data through Gora) in a step before the benchmark starts.
>>> Something like this:
>>> # pass config parameters to generate Gora beans with the required
>>> # number of fields
>>> # this should output the generated class and the method that does the
>>> # insertion
>>> $ gora_compiler.sh --benchmark --fields_required 4
>>> The output path containing the result of this should then be included
>>> (or passed) as a runtime dependency to the benchmark class.
>>> 3) Because Gora uses Avro, we can use complex data types, e.g.,
>>> arrays, maps. So we could represent the number of fields as the number
>>> of elements inside an array. I would think that this option gives us
>>> the best performance.
>>> I think we should continue with option (1) until we have the entire
>>> pipeline working, and we understand how every piece fits together with
>>> each other (YCSB, Gora, Gora compiler, benchmark setup steps). Then we
>>> should do (2) which is the most general and the one that reflects how
>>> people usually use Gora, and then we test with (3). I think all of
>>> these steps are totally doable in our time frame as we build upon
>>> previous steps.
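
To make the difference between options (1) and (3) above concrete, here is a minimal Java sketch. It is only an illustration under stated assumptions: `UserBean`, `ArrayBean`, the `setField<i>` naming scheme, and both helper methods are hypothetical stand-ins, not classes produced by the Gora compiler or code from the GORA-532 branch.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class FieldAccessSketch {

    /** Hypothetical stand-in for a generated bean with one setter per field. */
    public static class UserBean {
        private String field0;
        private String field1;
        private String field2;

        public void setField0(String v) { field0 = v; }
        public void setField1(String v) { field1 = v; }
        public void setField2(String v) { field2 = v; }
        public String getField0() { return field0; }
        public String getField1() { return field1; }
        public String getField2() { return field2; }
    }

    /** Hypothetical stand-in for a record that keeps all values in one list. */
    public static class ArrayBean {
        private final List<String> fields = new ArrayList<>();
        public List<String> getFields() { return fields; }
    }

    /** Option (1): look up setField<i> by name at runtime and invoke it. */
    public static void fillByReflection(Object bean, int fieldCount, String value) {
        try {
            for (int i = 0; i < fieldCount; i++) {
                Method setter = bean.getClass().getMethod("setField" + i, String.class);
                setter.invoke(bean, value + i);
            }
        } catch (ReflectiveOperationException e) {
            // Missing or inaccessible setter: surface it as an unchecked error.
            throw new IllegalStateException("Cannot set fields reflectively", e);
        }
    }

    /** Option (3): the field count is simply the number of list elements. */
    public static void fillByList(ArrayBean bean, int fieldCount, String value) {
        for (int i = 0; i < fieldCount; i++) {
            bean.getFields().add(value + i);
        }
    }

    public static void main(String[] args) {
        UserBean user = new UserBean();
        fillByReflection(user, 3, "v");
        System.out.println(user.getField0() + " " + user.getField2());

        ArrayBean array = new ArrayBean();
        fillByList(array, 3, "v");
        System.out.println(array.getFields());
    }
}
```

The reflective version pays for a method lookup and an indirect call per field on every operation, while the list version needs no per-field methods at all, which is one way to read the performance argument made for option (3).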
>>> The other thing that we should decide is which backend to use, as some
>>> backends are more mature than others. I'd say to use the HBase backend
>>> as it is the most stable one and the one with the most features, and
>>> if we feel brave we can try other backends (and fix them if
>>> necessary!)
>>>
>>>
>>> Best,
>>>
>>> Renato M.
>>>
>>> On Sun, Jun 2, 2019 at 19:10, Sheriffo Ceesay
>>> (<sneceesa...@gmail.com>) wrote:
>>> >
>>> > Dear Mentors,
>>> >
>>> > My week one report is available at
>>> >
>>> > https://cwiki.apache.org/confluence/display/GORA/%5BGORA-532%5D+Apache+Gora+Benchmark+Module+Weekly+Report
>>> >
>>> > I have also included a detailed question, and I will need your
>>> > guidance on that.
>>> >
>>> > Please let me know what your thoughts are.
>>> >
>>> > Thank you.
>>> >
>>> > **Sheriffo Ceesay**
>>>
>>

-- 
http://home.apache.org/~lewismc/
http://people.apache.org/keys/committer/lewismc
