On Mon, Jul 23, 2012 at 6:41 PM, Tharindu Mathew <thari...@wso2.com> wrote:
> If you are planning to do a few MB, that would mean that the total size of
> logs will be (size of logs * no. of tenants), so roughly for 200 active
> tenants and 2 MB of logs each, it would come to around 400 MB. This is still
> manageable in a custom task if your data processing is low.
>
> On Mon, Jul 23, 2012 at 6:24 PM, Afkham Azeez <az...@wso2.com> wrote:
>
>> Like you said, the task may not be the best way to do this. Like we
>> discussed the other day, we can publish logs to unique column families
>> which contain the <Service>_<Tenant>_<Date> as the unique identifier. We
>> need to generate logs in a file format & allow tenant users to download
>> those. What is the best approach to generate these log files from the data
>> collected? Typically, such a log file can run into a few MB.
>
> I'm a bit confused, as we did not need to use Hive as per our earlier
> conversation. This is because, as the data is published, it is already
> grouped by server/tenant and date.

Yeah, there is no analytics to be done. It is a problem of converting data
stored in Cassandra into a flat file.

>> Azeez
>>
>> On Mon, Jul 23, 2012 at 6:18 PM, Tharindu Mathew <thari...@wso2.com> wrote:
>>
>>> I'm no expert, but I immediately question the scale of this approach.
>>>
>>> Do you have an idea of how much log data you plan to process per task?
>>>
>>> On Mon, Jul 23, 2012 at 6:13 PM, Afkham Azeez <az...@wso2.com> wrote:
>>>
>>>> The requirement is simple. We need to generate log files on a per
>>>> tenant, per date, per service basis. Now, as a big data & analytics
>>>> expert, please advise us on what the best solution for this is.
>>>>
>>>> Azeez
>>>>
>>>> On Mon, Jul 23, 2012 at 6:05 PM, Tharindu Mathew <thari...@wso2.com> wrote:
>>>>
>>>>> So through this custom Java task, what is the scale of log processing
>>>>> you will support? 100 MB, 1 GB, 100 GB, 1 TB?
>>>>>
>>>>> On Mon, Jul 23, 2012 at 5:14 PM, Manisha Gayathri <mani...@wso2.com> wrote:
>>>>>
>>>>>> Contacted the Hive User Group as well on this matter.
>>>>>> They also mentioned that this approach is not possible.
>>>>>> Also, as per the chat I had with Buddhika, right now this kind of
>>>>>> dynamic variable creation is not possible in the Hive that comes
>>>>>> with BAM2.
>>>>>>
>>>>>> Therefore IMO, rather than going ahead with this cumbersome process,
>>>>>> the best way will be to run a scheduled Java task to pick data from
>>>>>> the relevant Cassandra column families and dynamically generate the
>>>>>> relevant log files (according to the tenantID and current date),
>>>>>> which will be stored in Apache Directory.
>>>>>
>>>>> You are going to store the results in an LDAP?
>>>>>
>>>>>> As per the offline chat I had with Azeez, I will start to work on a
>>>>>> custom Java task that can handle the above scenario.
>>>>>>
>>>>>> On Mon, Jul 23, 2012 at 2:27 PM, Manisha Gayathri <mani...@wso2.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> For a log file storing scenario using BAM2, I have a requirement to
>>>>>>> generate separate log files for each date. For that I have created
>>>>>>> a Hive analytic query along with a Hive UDF.
>>>>>>>
>>>>>>> I have the getFilePath function, which should return a URL like this:
>>>>>>>
>>>>>>> home/user/Desktop/logDir/logs/log_0_testServer_2012_07_22
>>>>>>>
>>>>>>> The defined function works perfectly if I put
>>>>>>> getFilePath( "0","testServer" ) into the select statement.
>>>>>>>
>>>>>>> But I want to get that particular URL as the local directory name.
>>>>>>> (The requirement is such that this should not be hard-coded in the
>>>>>>> Hive query. Rather, it should be generated in the custom UDF.)
>>>>>>>
>>>>>>> So can I do something like I've shown below?
>>>>>>>
>>>>>>> set file_name= getFilePath( "0","testServer" ); //Define a parameter.
>>>>>>> .................
>>>>>>> ..............
>>>>>>> INSERT OVERWRITE LOCAL DIRECTORY 'file:///${hiveconf:file_name}' //Assign the above parameter as the file URL
>>>>>>>
>>>>>>> I tried it this way, but the directory name is returned as
>>>>>>>
>>>>>>> file:/getFilePath( "0" , "testServer" )
>>>>>>>
>>>>>>> Does that mean I cannot use a UDF to define the local directory name?
>>>>>>> Or am I doing something wrong here?
>>>>>>>
>>>>>>> --
>>>>>>> ~Regards
>>>>>>> Manisha Eleperuma
>>>>>>> Software Engineer
>>>>>>> WSO2, Inc.: http://wso2.com
>>>>>>> lean.enterprise.middleware
>>>>>
>>>>> --
>>>>> Regards,
>>>>>
>>>>> Tharindu
>>>>>
>>>>> blog: http://mackiemathew.com/
>>>>> M: +94777759908

--
Afkham Azeez
Director of Architecture; WSO2, Inc.; http://wso2.com
Member; Apache Software Foundation; http://www.apache.org/
email: az...@wso2.com cell: +94 77 3320919
blog: http://blog.afkham.org
twitter: http://twitter.com/afkham_azeez
linked-in: http://lk.linkedin.com/in/afkhamazeez

Lean . Enterprise . Middleware
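
The conversion discussed in this thread (rows already grouped by service, tenant and date, written out as one flat file per group) might look roughly like the sketch below. It is only an illustration under assumptions: TenantLogFiles, LogEvent, fileName, groupByFile and writeAll are hypothetical names, the events are passed in as an in-memory list rather than read from Cassandra, and the file name pattern simply mirrors the log_0_testServer_2012_07_22 example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names are BAM2, Hive or Cassandra APIs. */
final class TenantLogFiles {

    /** One log event, as it might look after being read back from a column family. */
    static final class LogEvent {
        final String tenantId;
        final String server;
        final String date;    // assumed to be pre-formatted as yyyy_MM_dd
        final String message;

        LogEvent(String tenantId, String server, String date, String message) {
            this.tenantId = tenantId;
            this.server = server;
            this.date = date;
            this.message = message;
        }
    }

    /** Builds a name in the style of the thread's example: log_0_testServer_2012_07_22. */
    static String fileName(String tenantId, String server, String date) {
        return "log_" + tenantId + "_" + server + "_" + date;
    }

    /** Groups messages by target file name, keeping the order in which they were seen. */
    static Map<String, List<String>> groupByFile(List<LogEvent> events) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (LogEvent e : events) {
            String name = fileName(e.tenantId, e.server, e.date);
            grouped.computeIfAbsent(name, k -> new ArrayList<>()).add(e.message);
        }
        return grouped;
    }

    /** Appends each group to its own file under logDir, creating the directory if needed. */
    static void writeAll(Path logDir, List<LogEvent> events) throws IOException {
        Files.createDirectories(logDir);
        for (Map.Entry<String, List<String>> entry : groupByFile(events).entrySet()) {
            Files.write(logDir.resolve(entry.getKey()), entry.getValue(),
                    StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }
}
```

A scheduled task could call writeAll once per run with whatever events it has fetched. Since the file name is computed in Java, nothing has to be substituted into a Hive query.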
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev