Re: XML - Pig UDF

2012-12-24 Thread Vitalii Tymchyshyn
I was doing such a thing in my previous project, but I parsed on demand. What I mean is that I created a set of XML-processing functions, each of which can take a string or a DOM as input, plus an explicit parse function. I did this because I was usually using concatenation/grouping on parsed input files and

Re: XML - Pig UDF

2012-12-24 Thread Russell Jurney
Thanks - any chance of contributing some of that code? :) I have thought of a similar approach: starting with an XMLToPig EvalFunc that takes the output of the existing XMLLoader and converts it to tuple/bag/map form. Easier to baby step that, just a matter of plugging that code in to the xml

Re: Error without pig script line information

2012-12-24 Thread Russell Jurney
Please paste the script. Russell Jurney http://datasyndrome.com On Dec 24, 2012, at 12:37 AM, Haitao Yao yao.e...@gmail.com wrote: Hi all, here's an error that has no line numbers attached: ERROR 1200: org.apache.pig.newplan.logical.expression.ScalarExpression cannot be cast to

Re: Sequence File processing

2012-12-24 Thread Cheolsoo Park
Hi Srini, You can use STRSPLIT to split your value chararray and define schema in a FOREACH. For example, if the value consists of 3 integers (i.e. 1|2|3), A= LOAD 'part-m-' USING SequenceFileLoader() AS (key:long,value:chararray); B = FOREACH A GENERATE key, FLATTEN( STRSPLIT(value,'\\|') )
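
As a rough illustration of the split-then-flatten idea described above, here is a minimal sketch in plain Python (not Pig). The function name `strsplit_flatten` and the sample rows are hypothetical and only meant to show what happens to each (key, value) record: the value is split on a regex delimiter, each piece is cast to an integer, and the pieces are laid out next to the key. The pipe is escaped because it is a regex metacharacter, which is also why the Pig call uses '\\|'.

```python
import re


def strsplit_flatten(key, value, delimiter_regex=r"\|"):
    """Split `value` on a regex delimiter and flatten the pieces next to `key`.

    Hypothetical helper: it mimics the effect of splitting a chararray and
    flattening the resulting tuple, then casting every field to int.
    """
    fields = re.split(delimiter_regex, value)
    return (key, *[int(field) for field in fields])


# Made-up sample records in (key, value) form.
rows = [(0, "1|2|3"), (1, "40|50|60")]
result = [strsplit_flatten(k, v) for k, v in rows]
print(result)  # [(0, 1, 2, 3), (1, 40, 50, 60)]
```

This sketch assumes every value holds only integers; a non-numeric piece would raise a `ValueError` here.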

Re: Sequence File processing

2012-12-24 Thread Mohammad Tariq
+1 Best Regards, Tariq +91-9741563634 https://mtariq.jux.com/ On Tue, Dec 25, 2012 at 3:07 AM, Cheolsoo Park cheol...@cloudera.com wrote: Hi Srini, You can use STRSPLIT to split your value chararray and define schema in a FOREACH. For example, if the value consists of 3 integers (i.e.

Re: Limit number of Streaming Programs

2012-12-24 Thread Cheolsoo Park
Hi Thomas, If I understand your question correctly, what you want is to reduce the number of mappers that spawn streaming processes. The default-parallel controls the number of reducers, so it won't have any effect on the number of mappers. Although the number of mappers is auto-determined by the

Re: Sequence File processing

2012-12-24 Thread Kshiva Kps
Hi, Are there any Pig editors in which we can write 100 to 150 Pig scripts? I believe that is not possible in CLI mode. Something like an IDE for Java or TOAD for SQL. Please advise. Many thanks. On Tue, Dec 25, 2012 at 3:09 AM, Mohammad Tariq donta...@gmail.com wrote: +1 Best Regards, Tariq

Re: Limit number of Streaming Programs

2012-12-24 Thread Mohammad Tariq
Folks on the list need some time, mate. I have posted a couple of links on your other thread. Check them out and see if they help. Best Regards, Tariq +91-9741563634 https://mtariq.jux.com/ On Tue, Dec 25, 2012 at 11:09 AM, Kshiva Kps kshiva...@gmail.com wrote: Hi, Is there any PIG

Re: Sequence File processing

2012-12-24 Thread Srini
Thanks Cheolsoo. On Mon, Dec 24, 2012 at 1:37 PM, Cheolsoo Park cheol...@cloudera.com wrote: Hi Srini, You can use STRSPLIT to split your value chararray and define schema in a FOREACH. For example, if the value consists of 3 integers (i.e. 1|2|3), A= LOAD 'part-m-' USING

Reg: Real Time PIG scripts

2012-12-24 Thread Kshiva Kps
Sorry to ask, but if possible, could you please advise on the points below? In general, how are Pig scripts written in real-time projects? 1. Pig scripts alone in Eclipse 2. Java + Pig scripts 3. Or are both possible, depending on the requirement? If possible, could you please share one script which can be executed thro