[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-06 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-32515:
-
Target Version/s:   (was: 2.4.6)

> Distinct Function Weird Bug
> ---
>
> Key: SPARK-32515
> URL: https://issues.apache.org/jira/browse/SPARK-32515
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.6
> Environment: Windows 10 and Mac, both have the same issues.
> Using Scala version 2.11.12
> Python 3.6.10
> java version "1.8.0_261"
>Reporter: Jayce Jiang
>Priority: Major
> Attachments: Capture.PNG, Capture1.png, Capture2.PNG, 
> Screen_Shot_2020-08-05_at_2.46.42_PM.png, image-2020-08-03-07-03-55-716.png, 
> unknown.png, unknown1.png, unknown2.png
>
>
> A strange Spark display and counting error. I loaded a CSV file into Spark 
> and tried to list all distinct values of one column in a DataFrame. 
> Everything I tried in Spark gave a wrong answer, but if I convert the Spark 
> DataFrame to a pandas DataFrame, it works. Please help. This bug only 
> happens with this one CSV file; all my other CSV files work properly. 
> Here are the pictures.
>  
> !image-2020-08-01-21-19-06-402.png!!image-2020-08-01-21-19-03-289.png!!image-2020-08-01-21-18-58-625.png!
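
The report does not say what is different about this one CSV file, so the following is only a hypothesis, not a diagnosis. One common reason two CSV readers disagree on the distinct values of a column is a quoted field that contains an embedded newline: a reader that treats every physical line as a record shifts the columns, while a reader that honors quoting keeps the record together. The sketch below shows that effect with the Python standard library only. The sample data and the function names distinct_naive and distinct_quoted are hypothetical and are not Spark or pandas code.

```python
import csv
import io

# Hypothetical sample: the second record has a quoted field with an embedded newline.
SAMPLE = (
    "id,comment,category\n"
    '1,"plain text",A\n'
    '2,"line one\nline two",B\n'
    '3,"more text",A\n'
)


def distinct_naive(text, column):
    """Treat every physical line as one record and split on commas."""
    lines = text.splitlines()
    idx = lines[0].split(",").index(column)
    values = set()
    for line in lines[1:]:
        parts = line.split(",")
        # A record broken by an embedded newline is too short: record None.
        values.add(parts[idx] if idx < len(parts) else None)
    return values


def distinct_quoted(text, column):
    """Parse with csv.DictReader, which keeps quoted multi-line fields together."""
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return {row[column] for row in reader}


if __name__ == "__main__":
    print("line-based parse :", distinct_naive(SAMPLE, "category"))
    print("quote-aware parse:", distinct_quoted(SAMPLE, "category"))
```

If the file in question does contain such fields, Spark's CSV reader has a multiLine option that may be worth trying; whether it applies here cannot be determined from the report.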



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-06 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-32515:
-
Fix Version/s: (was: 2.4.6)




[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-05 Thread Jayce Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jayce Jiang updated SPARK-32515:

Attachment: Screen_Shot_2020-08-05_at_2.46.42_PM.png




[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-02 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-32515:
-
Component/s: (was: PySpark)
 SQL

> Distinct Function Weird Bug
> ---
>
> Key: SPARK-32515
> URL: https://issues.apache.org/jira/browse/SPARK-32515
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.6
> Environment: Windows 10 and Mac, both have the same issues.
> Using Scala version 2.11.12
> Python 3.6.10
> java version "1.8.0_261"
>Reporter: Jayce Jiang
>Priority: Major
> Attachments: Capture.PNG, Capture1.png, Capture2.PNG, 
> image-2020-08-03-07-03-55-716.png, unknown.png, unknown1.png, unknown2.png
>



[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-02 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-32515:
-
Target Version/s:   (was: 2.4.6)




[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-02 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-32515:
-
Labels:   (was: distinct groupby load read)

> Distinct Function Weird Bug
> ---
>
> Key: SPARK-32515
> URL: https://issues.apache.org/jira/browse/SPARK-32515
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.4.6
> Environment: Windows 10 and Mac, both have the same issues.
> Using Scala version 2.11.12
> Python 3.6.10
> java version "1.8.0_261"
>Reporter: Jayce Jiang
>Priority: Major
> Attachments: Capture.PNG, Capture1.png, Capture2.PNG, 
> image-2020-08-03-07-03-55-716.png, unknown.png, unknown1.png, unknown2.png
>



[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-02 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-32515:
-
Fix Version/s: (was: 2.4.6)

> Distinct Function Weird Bug
> ---
>
> Key: SPARK-32515
> URL: https://issues.apache.org/jira/browse/SPARK-32515
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.4.6
> Environment: Windows 10 and Mac, both have the same issues.
> Using Scala version 2.11.12
> Python 3.6.10
> java version "1.8.0_261"
>Reporter: Jayce Jiang
>Priority: Major
>  Labels: distinct, groupby, load, read
> Attachments: Capture.PNG, Capture1.png, Capture2.PNG, 
> image-2020-08-03-07-03-55-716.png, unknown.png, unknown1.png, unknown2.png
>



[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-02 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-32515:
-
Priority: Major  (was: Blocker)

> Distinct Function Weird Bug
> ---
>
> Key: SPARK-32515
> URL: https://issues.apache.org/jira/browse/SPARK-32515
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.4.6
> Environment: Windows 10 and Mac, both have the same issues.
> Using Scala version 2.11.12
> Python 3.6.10
> java version "1.8.0_261"
>Reporter: Jayce Jiang
>Priority: Major
>  Labels: distinct, groupby, load, read
> Fix For: 2.4.6
>
> Attachments: Capture.PNG, Capture1.png, Capture2.PNG, 
> image-2020-08-03-07-03-55-716.png, unknown.png, unknown1.png, unknown2.png
>



[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-02 Thread JinxinTang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JinxinTang updated SPARK-32515:
---
Attachment: image-2020-08-03-07-03-55-716.png

> Distinct Function Weird Bug
> ---
>
> Key: SPARK-32515
> URL: https://issues.apache.org/jira/browse/SPARK-32515
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.4.6
> Environment: Windows 10 and Mac, both have the same issues.
> Using Scala version 2.11.12
> Python 3.6.10
> java version "1.8.0_261"
>Reporter: Jayce Jiang
>Priority: Blocker
>  Labels: distinct, groupby, load, read
> Fix For: 2.4.6
>
> Attachments: Capture.PNG, Capture1.png, Capture2.PNG, 
> image-2020-08-03-07-03-55-716.png, unknown.png, unknown1.png, unknown2.png
>



[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-02 Thread Jayce Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jayce Jiang updated SPARK-32515:

Attachment: Capture1.png

> Distinct Function Weird Bug
> ---
>
> Key: SPARK-32515
> URL: https://issues.apache.org/jira/browse/SPARK-32515
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.4.6
> Environment: Windows 10 and Mac, both have the same issues.
> Using Scala version 2.11.12
> Python 3.6.10
> java version "1.8.0_261"
>Reporter: Jayce Jiang
>Priority: Blocker
>  Labels: distinct, groupby, load, read
> Fix For: 2.4.6
>
> Attachments: Capture.PNG, Capture1.png, Capture2.PNG, unknown.png, 
> unknown1.png, unknown2.png
>



[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-02 Thread Jayce Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jayce Jiang updated SPARK-32515:

Attachment: Capture2.PNG




[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-02 Thread Jayce Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jayce Jiang updated SPARK-32515:

Attachment: Capture.PNG

> Distinct Function Weird Bug
> ---
>
> Key: SPARK-32515
> URL: https://issues.apache.org/jira/browse/SPARK-32515
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.4.6
> Environment: Windows 10 and Mac, both have the same issues.
> Using Scala version 2.11.12
> Python 3.6.10
> java version "1.8.0_261"
>Reporter: Jayce Jiang
>Priority: Blocker
>  Labels: distinct, groupby, load, read
> Fix For: 2.4.6
>
> Attachments: Capture.PNG, unknown.png, unknown1.png, unknown2.png
>



[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-02 Thread Jayce Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jayce Jiang updated SPARK-32515:

Attachment: unknown2.png
unknown1.png
unknown.png

> Distinct Function Weird Bug
> ---
>
> Key: SPARK-32515
> URL: https://issues.apache.org/jira/browse/SPARK-32515
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 2.4.6
> Environment: Windows 10 and Mac, both have the same issues.
> Using Scala version 2.11.12
> Python 3.6.10
> java version "1.8.0_261"
>Reporter: Jayce Jiang
>Priority: Blocker
>  Labels: distinct, groupby, load, read
> Fix For: 2.4.6
>
> Attachments: unknown.png, unknown1.png, unknown2.png
>