[ https://issues.apache.org/jira/browse/SPARK-19971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom Tang updated SPARK-19971:
-----------------------------
    Description:
Let's say we have a CSV file /tmp/1.csv:
{code}
cid,name
-100224910923912596,jack
-100224910923912595,tom
-1,rose
-2,marry
-100,rose1
-101,rose2
{code}
Use the following SQL to define a view in Spark SQL:
{code}
CREATE TEMPORARY VIEW T
(
  `cid` string,
  `name` string
)
USING CSV
OPTIONS (
  path "/tmp/1.csv"
);
{code}
Statement 1:
{code}select * from T where cid = -100224910923912596;{code}
Returns:
{code}
-100224910923912596 jack
-100224910923912595 tom
{code}
Statement 2:
{code}select * from T where cid = -100224910923912599;{code}
It also returns:
{code}
-100224910923912596 jack
-100224910923912595 tom
{code}
Statement 3, with the literal quoted as a string:
{code}select * from T where cid = '-100224910923912596';{code}
Returns:
{code}
-100224910923912596 jack
{code}
I think the behaviour of statements 1 and 2 is pretty weird: statement 1 returns an extra row, and statement 2 matches a value that is not in the file at all.
Statement 4:
{code}select * from T where cid = -100;{code}
Returns:
{code}-100 rose1{code}
So this only affects the large numbers; the smaller ones seem to be fine.
Does that look like a bug to you folks?
Thanks.

> Wired SELECT equal behaviour.
> ------------------------------
>
>                 Key: SPARK-19971
>                 URL: https://issues.apache.org/jira/browse/SPARK-19971
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>         Environment: macOS Sierra
>            Reporter: Tom Tang
>            Priority: Critical
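One possible explanation, which is not confirmed anywhere in this message, is that the string column and the unquoted numeric literal are both converted to a double before the comparison. Values of this magnitude are spaced 16 apart as doubles, so neighbouring ids would collapse to the same number. The small Python sketch below only simulates that assumption on the ids from /tmp/1.csv; it does not run Spark, and the helper names are made up for illustration.

```python
from typing import List

# The cid values from /tmp/1.csv, kept as strings like the view's `cid` column.
IDS = [
    "-100224910923912596",
    "-100224910923912595",
    "-1",
    "-2",
    "-100",
    "-101",
]


def matches_as_double(literal: int) -> List[str]:
    """Hypothetical helper: compare after converting BOTH sides to a double.

    ASSUMPTION: this mimics a string = number comparison done in double
    precision. It is not taken from Spark's source code.
    """
    return [cid for cid in IDS if float(cid) == float(literal)]


def matches_as_string(literal: str) -> List[str]:
    """Hypothetical helper: plain string equality, as in the quoted statement."""
    return [cid for cid in IDS if cid == literal]


if __name__ == "__main__":
    # Both large ids round to the same double, -100224910923912592.
    print(int(float("-100224910923912596")))       # -100224910923912592
    print(matches_as_double(-100224910923912596))  # jack's and tom's ids
    print(matches_as_double(-100224910923912599))  # the same two ids
    print(matches_as_double(-100))                 # ['-100']
    print(matches_as_string("-100224910923912596"))  # only jack's id
```

If that assumption holds, it would account for all four results above: the two large ids are indistinguishable as doubles, while -100 and -101 stay distinct. Quoting the literal, as in the third statement, keeps the comparison on strings and avoids the problem.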