RE: Hive REGEXP_REPLACE use or equivalent in Spark

2016-02-19 Thread Mich Talebzadeh
deh <m...@peridale.co.uk>
Cc: user @spark <user@spark.apache.org>
Subject: Re: Hive REGEXP_REPLACE use or equivalent in Spark

You might be better off using the CSV loader in this case.
https://github.com/databricks/spark-csv

Input:
[csingh ~]$ hadoop fs -cat test.csv
360,10/

Re: Hive REGEXP_REPLACE use or equivalent in Spark

2016-02-19 Thread Chandeep Singh
ology Ltd, its > subsidiaries or their employees, unless expressly so stated. It is the > responsibility of the recipient to ensure that this email is virus free, > therefore neither Peridale Technology Ltd, its subsidiaries nor their > employees accept any responsibili

Re: Hive REGEXP_REPLACE use or equivalent in Spark

2016-02-19 Thread UMESH CHAUDHARY
> *From:* Andrew Ehrlich [mailto:and...@aehrlich.

RE: Hive REGEXP_REPLACE use or equivalent in Spark

2016-02-19 Thread Mich Talebzadeh
t: 19 February 2016 01:22
To: Mich Talebzadeh <m...@peridale.co.uk>
Cc: User <user@spark.apache.org>
Subject: Re: Hive REGEXP_REPLACE use or equivalent in Spark

Use the scala method .split(",") to split the string into a collection of strings, and try using .replac

Re: Hive REGEXP_REPLACE use or equivalent in Spark

2016-02-18 Thread Andrew Ehrlich
Use the Scala method .split(",") to split the string into a collection of strings, then call .replaceAll() on the field that contains the "?" to remove it.

On Thu, Feb 18, 2016 at 2:09 PM, Mich Talebzadeh wrote:
> Hi,
>
> What is the equivalent of this Hive statement in Spark
>
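A minimal sketch of the .split(",") and .replaceAll() suggestion in plain Scala. The sample line, its field layout, and the names SplitAndClean, cleanField and cleanColumn are hypothetical, chosen only for illustration; they do not come from the thread. Note that a plain split on commas would also cut an amount such as "?2,500.00" in two, so this sketch assumes the amount field contains no embedded comma.

```scala
object SplitAndClean {
  // Keep only digits and the decimal point, dropping "?" and any other character.
  def cleanField(raw: String): String = raw.replaceAll("[^\\d.]", "")

  // Split a comma-separated line and clean the field at the given index.
  // Assumption: no field contains an embedded comma.
  def cleanColumn(line: String, index: Int): Array[String] = {
    val fields = line.split(",")
    fields.updated(index, cleanField(fields(index)))
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical input line, not taken from the thread.
    val cleaned = cleanColumn("360,10/02/2014,?300.00", 2)
    println(cleaned.mkString(","))
  }
}
```

If the amounts can contain thousands separators, a CSV parser that understands quoting is likely a safer choice than a bare split.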

Hive REGEXP_REPLACE use or equivalent in Spark

2016-02-18 Thread Mich Talebzadeh
Hi,

What is the equivalent of this Hive statement in Spark

    select "?2,500.00", REGEXP_REPLACE("?2,500.00",'[^\\d\\.]','');

    +------------+----------+
    | _c0        | _c1      |
    +------------+----------+
    | ?2,500.00  | 2500.00  |
    +------------+----------+

Basically I want to get rid of
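For reference, the same pattern should behave the same way outside Hive, because both Hive and Scala strings rely on Java regular expressions. The following is a minimal sketch in plain Scala, not a definitive answer to the question; the names AmountCleaner and clean are hypothetical. Spark SQL also provides a regexp_replace function in org.apache.spark.sql.functions that takes a column, a pattern and a replacement, which may be the closer match when the data is already in a DataFrame.

```scala
object AmountCleaner {
  // Same character class as the Hive call: remove anything that is not a digit or a dot.
  // In a Scala string literal, "\\d" denotes the regex \d and "\\." a literal dot.
  def clean(amount: String): String = amount.replaceAll("[^\\d\\.]", "")

  def main(args: Array[String]): Unit = {
    // The value from the Hive query in the original question.
    println(clean("?2,500.00")) // prints 2500.00
  }
}
```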