On Mon, Jul 23, 2012 at 6:41 PM, Tharindu Mathew <thari...@wso2.com> wrote:

> If each log file is a few MB, the total volume will be
> ( size of logs * no. of tenants ). So for roughly 200 active
> tenants and 2 MB of logs each, it would come to around 400 MB. This is still
> manageable in a custom task if your data processing load is low.
>
> On Mon, Jul 23, 2012 at 6:24 PM, Afkham Azeez <az...@wso2.com> wrote:
>
>> As you said, the task may not be the best way to do this. As we
>> discussed the other day, we can publish logs to unique column families
>> that use <Service>_<Tenant>_<Date> as the unique identifier. We
>> need to generate logs in a file format and allow tenant users to download
>> them. What is the best approach to generate these log files from the data
>> collected? Typically, such a log file can run into a few MB.
>
> I'm a bit confused, since per our earlier conversation we did not need
> to use Hive: by the time the data is published, it is already
> grouped by server/tenant and date.
>

Yeah, there is no analytics to be done. It is a problem of converting data
stored in Cassandra into a flat file.
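
A minimal sketch of that conversion step, in Java since the thread proposes a
custom Java task. It assumes the rows have already been read from Cassandra
(the read itself is not shown). The LogFileExporter class, the LogEvent fields
and the export method are hypothetical names; only the
log_<tenant>_<service>_<date> file naming is taken from the example path
quoted further down.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: writes already-fetched log rows to one flat file per tenant/service/date. */
class LogFileExporter {

    /** One log row as it might look after being read from the store (field names are assumptions). */
    static final class LogEvent {
        final String tenantId;
        final String service;
        final LocalDate date;
        final String message;

        LogEvent(String tenantId, String service, LocalDate date, String message) {
            this.tenantId = tenantId;
            this.service = service;
            this.date = date;
            this.message = message;
        }
    }

    private static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("yyyy_MM_dd");

    /** Builds a name of the form log_<tenant>_<service>_<yyyy_MM_dd>. */
    static String fileName(String tenantId, String service, LocalDate date) {
        return "log_" + tenantId + "_" + service + "_" + date.format(DATE);
    }

    /**
     * Groups messages by target file name (keeping input order), writes each group
     * to its own file under outputDir, and returns the grouping for inspection.
     */
    static Map<String, List<String>> export(List<LogEvent> events, Path outputDir) {
        Map<String, List<String>> linesByFile = new LinkedHashMap<>();
        for (LogEvent e : events) {
            linesByFile
                .computeIfAbsent(fileName(e.tenantId, e.service, e.date), k -> new ArrayList<>())
                .add(e.message);
        }
        try {
            Files.createDirectories(outputDir);
            for (Map.Entry<String, List<String>> entry : linesByFile.entrySet()) {
                // Files.write creates the file or truncates an existing one by default.
                Files.write(outputDir.resolve(entry.getKey()), entry.getValue(), StandardCharsets.UTF_8);
            }
        } catch (IOException ex) {
            // Rethrown unchecked so a scheduler can decide whether to retry the run.
            throw new UncheckedIOException(ex);
        }
        return linesByFile;
    }
}
```

This holds every message in memory before writing, which seems acceptable at
the few-MB-per-file scale discussed here; a larger volume would call for
streaming each row straight to its file instead.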


>
>> Azeez
>>
>>
>> On Mon, Jul 23, 2012 at 6:18 PM, Tharindu Mathew <thari...@wso2.com> wrote:
>>
>>> I'm no expert, but I immediately question the scale of this approach.
>>>
>>> Do you have an idea of how much log data you plan to process per task?
>>>
>>>
>>> On Mon, Jul 23, 2012 at 6:13 PM, Afkham Azeez <az...@wso2.com> wrote:
>>>
>>>> The requirement is simple. We need to generate log files on a per
>>>> tenant, per date, per Service basis. Now as a big data & analytics expert,
>>>> please advise us on what is the best solution for this.
>>>>
>>>> Azeez
>>>>
>>>>
>>>> On Mon, Jul 23, 2012 at 6:05 PM, Tharindu Mathew <thari...@wso2.com> wrote:
>>>>
>>>>> So with this custom Java task, what scale of log processing
>>>>> will you support? 100 MB, 1 GB, 100 GB, 1 TB?
>>>>>
>>>>> On Mon, Jul 23, 2012 at 5:14 PM, Manisha Gayathri <mani...@wso2.com> wrote:
>>>>>
>>>>>> I contacted the Hive User Group on this matter as well.
>>>>>> They also said that this approach is not possible.
>>>>>> Also, as per the chat I had with Buddhika, this kind of dynamic
>>>>>> variable creation is currently not possible in the Hive that comes
>>>>>> with BAM2.
>>>>>>
>>>>>> Therefore, IMO, rather than going ahead with this cumbersome process,
>>>>>> the best way will be to run a scheduled Java task that picks data from
>>>>>> the relevant Cassandra column families and dynamically generates the
>>>>>> relevant log files (according to the tenantID and current date), which
>>>>>> will be stored in Apache Directory.
>>>>>>
>>>>> You are going to store the results in an LDAP directory?
>>>>>
>>>>>>
>>>>>> As per the offline chat I had with Azeez, I will start work on a
>>>>>> custom Java task that can handle the above scenario.
>>>>>>
>>>>>> On Mon, Jul 23, 2012 at 2:27 PM, Manisha Gayathri <mani...@wso2.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> For a log file storing scenario using BAM2, I have a requirement to
>>>>>>> generate separate log files for each date. For that I have created a
>>>>>>> Hive analytic query along with a Hive UDF.
>>>>>>>
>>>>>>> I have the getFilePath function which should return a URL like this.
>>>>>>>
>>>>>>> home/user/Desktop/logDir/logs/log_0_testServer_2012_07_22
>>>>>>>
>>>>>>> The defined function works perfectly if I put
>>>>>>> getFilePath( "0","testServer" ) into the *select* statement.
>>>>>>>
>>>>>>> But I want to use that particular URL as the *local directory name*.
>>>>>>> (The requirement is that this should not be hard-coded in the Hive
>>>>>>> query; rather, it should be generated in the custom UDF.)
>>>>>>>
>>>>>>> So can I do something like what I've shown below?
>>>>>>>
>>>>>>> set file_name= getFilePath( "0","testServer" );    // Define a parameter.
>>>>>>> .................
>>>>>>> ..............
>>>>>>> INSERT OVERWRITE LOCAL DIRECTORY 'file:///${hiveconf:file_name}'
>>>>>>>                  // Assign the above parameter as the file URL
>>>>>>>
>>>>>>> I tried this way. But the directory name is returned as
>>>>>>>
>>>>>>> file:/getFilePath( "0" , "testServer" )
>>>>>>>
>>>>>>> Does that mean I cannot use a UDF to define the local directory name?
>>>>>>> Or am I doing something wrong here?
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> ~Regards
>>>>>>> *Manisha Eleperuma*
>>>>>>> Software Engineer
>>>>>>> WSO2, Inc.: http://wso2.com
>>>>>>> lean.enterprise.middleware
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Dev mailing list
>>>>>> Dev@wso2.org
>>>>>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>>
>>>>> Tharindu
>>>>>
>>>>> blog: http://mackiemathew.com/
>>>>> M: +94777759908
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> *Afkham Azeez*
>>>> Director of Architecture; WSO2, Inc.; http://wso2.com
>>>> Member; Apache Software Foundation; http://www.apache.org/
>>>> email: az...@wso2.com cell: +94 77 3320919
>>>> blog: http://blog.afkham.org
>>>> twitter: http://twitter.com/afkham_azeez
>>>> linked-in: http://lk.linkedin.com/in/afkhamazeez
>>>>
>>>> *Lean . Enterprise . Middleware*
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
>


-- 
*Afkham Azeez*
Director of Architecture; WSO2, Inc.; http://wso2.com
Member; Apache Software Foundation; http://www.apache.org/
email: az...@wso2.com cell: +94 77 3320919
blog: http://blog.afkham.org
twitter: http://twitter.com/afkham_azeez
linked-in: http://lk.linkedin.com/in/afkhamazeez

*Lean . Enterprise . Middleware*
