)
using Java.
Can anyone suggest an approach?
Note: I need to use a separator other than \n because my data contains \n as part of the
column value.
Thanks & Regards
Radha krishna
to HDFS with the same line separator
(RS, \u001e)
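With Hadoop-based input, the record delimiter can be changed from \n by setting `textinputformat.record.delimiter` on the Hadoop Configuration before reading (e.g. via `newAPIHadoopFile`), so records split on RS (\u001e) and embedded newlines survive inside column values. The plain-Java sketch below only illustrates the splitting behavior itself; the sample data is made up:

```java
public class RsSplitDemo {
    public static void main(String[] args) {
        // Two records separated by the RS control character (\u001e);
        // the first record contains an embedded \n inside a column value.
        String data = "1001\taba\nline2\t10\u001e1002\tabs\t20";

        // Splitting on RS keeps the embedded newline intact.
        String[] records = data.split("\u001e");
        System.out.println(records.length);            // 2
        System.out.println(records[0].contains("\n")); // true
    }
}
```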
Thanks & Regards
Radha krishna
Hi All,
Please check below for the code, input, and output. I think the output is
not correct; am I missing anything? Please guide.
Code
public class Test {
    private static JavaSparkContext jsc = null;
    private static SQLContext sqlContext = null;
    private static Configuration hadoopConf = null;

    // ... (setup and RDD creation elided in the original message)

        DataFrame empDF = sqlContext.createDataFrame(empRDD, Emp.class);
        empDF.registerTempTable("EMP");

        sqlContext.sql("SELECT * FROM EMP e LEFT OUTER JOIN DEPT d "
                + "ON e.deptid = d.deptid").show();

        // empDF.join(deptDF, empDF.col("deptid").equalTo(deptDF.col("deptid")),
        //         "leftouter").show();
        }
        catch (Exception e) {
            System.out.println(e);
        }
    }

    public static Emp getInstance(String[] parts, Emp emp) throws ParseException {
        emp.setId(parts[0]);
        emp.setName(parts[1]);
        emp.setDeptid(parts[2]);
        return emp;
    }

    public static Dept getInstanceDept(String[] parts, Dept dept) throws ParseException {
        dept.setDeptid(parts[0]);
        dept.setDeptname(parts[1]);
        return dept;
    }
}

Input
Emp
1001 aba 10
1002 abs 20
1003 abd 10
1004 abf 30
1005 abg 10
1006 abh 20
1007 abj 10
1008 abk 30
1009 abl 20
1010 abq 10

Dept
10 dev
20 Test
30 IT

Output
+------+----+----+------+--------+
|deptid|  id|name|deptid|deptname|
+------+----+----+------+--------+
|    10|1001| aba|    10|     dev|
|    10|1003| abd|    10|     dev|
|    10|1005| abg|    10|     dev|
|    10|1007| abj|    10|     dev|
|    10|1010| abq|    10|     dev|
|    20|1002| abs|  null|    null|
|    20|1006| abh|  null|    null|
|    20|1009| abl|  null|    null|
|    30|1004| abf|  null|    null|
|    30|1008| abk|  null|    null|
+------+----+----+------+--------+
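In the output above, only deptid 10 finds a match while 20 and 30 come back null even though both exist in DEPT. With space-separated input like this, a common culprit is stray whitespace around the join key after splitting the line. A minimal sketch of one way to guard against that while parsing (the `parse` helper below is hypothetical, not from the original post) is to split on runs of whitespace and trim each field:

```java
import java.util.Arrays;

public class TrimKeyDemo {
    // Hypothetical parser: split on runs of whitespace and trim each part,
    // so a line like " 1002  abs  20 " yields clean join keys.
    static String[] parse(String line) {
        return Arrays.stream(line.trim().split("\\s+"))
                     .map(String::trim)
                     .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] parts = parse(" 1002  abs  20 ");
        System.out.println(Arrays.toString(parts)); // [1002, abs, 20]
        // The key "20" now compares equal to the DEPT key "20" exactly.
        System.out.println(parts[2].equals("20"));  // true
    }
}
```

The same normalization can also be applied inline in the SQL, e.g. `ON trim(e.deptid) = trim(d.deptid)`.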
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Left-outer-Join-issue-using-programmatic-sql-joins-tp27295.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
--
Best Regards,
Ayan Guha
--
Thanks & Regards
Radha krishna
Hi Mich,
Here I have given just sample data. I have some GBs of files in HDFS and I am
performing left outer joins on those files; the final result will be stored in a
Vertica database table. There are no duplicate columns in the target table, but
for the non-matching rows' columns I want to insert
+---+----+
|_c0|code|
+---+----+
| 18|  AS|
| 16|    |
| 13|  UK|
| 14|  US|
| 20|  As|
| 15|  IN|
| 19|  IR|
| 11|  PK|
+---+----+
I am expecting the output below. Any idea how to apply IS NOT NULL?
+---+----+
|_c0|code|
+---+----+
| 18|  AS|
| 13|  UK|
| 14|  US|
| 20|  As|
| 15|  IN|
| 19|  IR|
| 11|  PK|
+---+----+
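As pointed out later in the thread, the "missing" value here is an empty string rather than SQL NULL, so a filter needs to handle both. In Spark SQL this would be something along the lines of `WHERE code IS NOT NULL AND trim(code) <> ''` (assuming the table is registered with the `_c0`/`code` columns shown above). A minimal plain-Java sketch of the same predicate:

```java
public class NonEmptyFilterDemo {
    // Mirrors: code IS NOT NULL AND trim(code) <> ''
    static boolean keep(String code) {
        return code != null && !code.trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(keep("AS"));  // true
        System.out.println(keep(""));    // false (drops the "| 16| |" row)
        System.out.println(keep(null));  // false
        System.out.println(keep("  "));  // false
    }
}
```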
Thanks & Regards
Radha krishna
OK, thank you. How do I achieve the requirement?
On Sun, Jul 10, 2016 at 8:44 PM, Sean Owen <so...@cloudera.com> wrote:
> It doesn't look like you have a NULL field; you have a string-valued
> field with an empty string.
>
> On Sun, Jul 10, 2016 at 3:19 PM, Radha krishna <grkm
I want to apply a null comparison to a column in sqlContext.sql. Is there any
way to achieve this?