[ https://issues.apache.org/jira/browse/SPARK-12218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15048619#comment-15048619 ]
Irakli Machabeli edited comment on SPARK-12218 at 12/9/15 1:47 PM: ------------------------------------------------------------------- In [13]: sqlContext.read.parquet('prp_enh1').where(" LoanID=62231 and not( PaymentsReceived=0 and ExplicitRoll in ('PreviouslyPaidOff', 'PreviouslyChargedOff'))").explain(True) == Parsed Logical Plan == 'Filter (('LoanID = 62231) && NOT (('PaymentsReceived = 0) && 'ExplicitRoll IN (PreviouslyPaidOff,PreviouslyChargedOff))) Relation[BorrowerRate#8543,MnthRate#8544,ObservationMonth#8545,CycleCounter#8546,LoanID#8547,Loankey#8548,OriginationDate#8549,OriginationQuarter#8550,LoanAmount#8551,Term#8552,LenderRate#8553,ProsperRating#8554,ScheduledMonthlyPaymentAmount#8555,ChargeoffMonth#8556,ChargeoffAmount#8557,CompletedMonth#8558,MonthOfLastPayment#8559,PaymentsReceived#8560,CollectionFees#8561,PrincipalPaid#8562,InterestPaid#8563,LateFees#8564,ServicingFees#8565,RecoveryPayments#8566,RecoveryPrin#8567,DaysPastDue#8568,PriorMonthDPD#8569,ExplicitRoll#8570,SummaryRoll#8571,CumulPrin#8572,EOMPrin#8573,ScheduledPrinRemaining#8574,ScheduledCumulPrin#8575,ScheduledPeriodicPrin#8576,BOMPrin#8577,ListingNumber#8578,DebtSaleMonth#8579,GrossCashFromDebtSale#8580,DebtSaleFee#8581,NetCashToInvestorsFromDebtSale#8582,OZVintage#8583] ParquetRelation[file:/d:/MktLending/prp_enh1] == Analyzed Logical Plan == BorrowerRate: double, MnthRate: double, ObservationMonth: date, CycleCounter: int, LoanID: int, Loankey: string, OriginationDate: date, OriginationQuarter: string, LoanAmount: double, Term: int, LenderRate: double, ProsperRating: string, ScheduledMonthlyPaymentAmount: double, ChargeoffMonth: date, ChargeoffAmount: double, CompletedMonth: date, MonthOfLastPayment: date, PaymentsReceived: double, CollectionFees: double, PrincipalPaid: double, InterestPaid: double, LateFees: double, ServicingFees: double, RecoveryPayments: double, RecoveryPrin: double, DaysPastDue: int, PriorMonthDPD: int, ExplicitRoll: string, SummaryRoll: string, CumulPrin: double, EOMPrin: double, ScheduledPrinRemaining: double, ScheduledCumulPrin: double, ScheduledPeriodicPrin: double, BOMPrin: double, ListingNumber: int, DebtSaleMonth: int, GrossCashFromDebtSale: double, DebtSaleFee: double, NetCashToInvestorsFromDebtSale: double, OZVintage: string Filter ((LoanID#8547 = 62231) && NOT ((PaymentsReceived#8560 = cast(0 as double)) && ExplicitRoll#8570 IN (PreviouslyPaidOff,PreviouslyChargedOff))) Relation[BorrowerRate#8543,MnthRate#8544,ObservationMonth#8545,CycleCounter#8546,LoanID#8547,Loankey#8548,OriginationDate#8549,OriginationQuarter#8550,LoanAmount#8551,Term#8552,LenderRate#8553,ProsperRating#8554,ScheduledMonthlyPaymentAmount#8555,ChargeoffMonth#8556,ChargeoffAmount#8557,CompletedMonth#8558,MonthOfLastPayment#8559,PaymentsReceived#8560,CollectionFees#8561,PrincipalPaid#8562,InterestPaid#8563,LateFees#8564,ServicingFees#8565,RecoveryPayments#8566,RecoveryPrin#8567,DaysPastDue#8568,PriorMonthDPD#8569,ExplicitRoll#8570,SummaryRoll#8571,CumulPrin#8572,EOMPrin#8573,ScheduledPrinRemaining#8574,ScheduledCumulPrin#8575,ScheduledPeriodicPrin#8576,BOMPrin#8577,ListingNumber#8578,DebtSaleMonth#8579,GrossCashFromDebtSale#8580,DebtSaleFee#8581,NetCashToInvestorsFromDebtSale#8582,OZVintage#8583] ParquetRelation[file:/d:/MktLending/prp_enh1] == Optimized Logical Plan == Filter ((LoanID#8547 = 62231) && NOT ((PaymentsReceived#8560 = 0.0) && ExplicitRoll#8570 IN (PreviouslyPaidOff,PreviouslyChargedOff))) Relation[BorrowerRate#8543,MnthRate#8544,ObservationMonth#8545,CycleCounter#8546,LoanID#8547,Loankey#8548,OriginationDate#8549,OriginationQuarter#8550,LoanAmount#8551,Term#8552,LenderRate#8553,ProsperRating#8554,ScheduledMonthlyPaymentAmount#8555,ChargeoffMonth#8556,ChargeoffAmount#8557,CompletedMonth#8558,MonthOfLastPayment#8559,PaymentsReceived#8560,CollectionFees#8561,PrincipalPaid#8562,InterestPaid#8563,LateFees#8564,ServicingFees#8565,RecoveryPayments#8566,RecoveryPrin#8567,DaysPastDue#8568,PriorMonthDPD#8569,ExplicitRoll#8570,SummaryRoll#8571,CumulPrin#8572,EOMPrin#8573,ScheduledPrinRemaining#8574,ScheduledCumulPrin#8575,ScheduledPeriodicPrin#8576,BOMPrin#8577,ListingNumber#8578,DebtSaleMonth#8579,GrossCashFromDebtSale#8580,DebtSaleFee#8581,NetCashToInvestorsFromDebtSale#8582,OZVintage#8583] ParquetRelation[file:/d:/MktLending/prp_enh1] == Physical Plan == Filter ((LoanID#8547 = 62231) && NOT ((PaymentsReceived#8560 = 0.0) && ExplicitRoll#8570 IN (PreviouslyPaidOff,PreviouslyChargedOff))) Scan ParquetRelation[file:/d:/MktLending/prp_enh1][BorrowerRate#8543,MnthRate#8544,ObservationMonth#8545,CycleCounter#8546,LoanID#8547,Loankey#8548,OriginationDate#8549,OriginationQuarter#8550,LoanAmount#8551,Term#8552,LenderRate#8553,ProsperRating#8554,ScheduledMonthlyPaymentAmount#8555,ChargeoffMonth#8556,ChargeoffAmount#8557,CompletedMonth#8558,MonthOfLastPayment#8559,PaymentsReceived#8560,CollectionFees#8561,PrincipalPaid#8562,InterestPaid#8563,LateFees#8564,ServicingFees#8565,RecoveryPayments#8566,RecoveryPrin#8567,DaysPastDue#8568,PriorMonthDPD#8569,ExplicitRoll#8570,SummaryRoll#8571,CumulPrin#8572,EOMPrin#8573,ScheduledPrinRemaining#8574,ScheduledCumulPrin#8575,ScheduledPeriodicPrin#8576,BOMPrin#8577,ListingNumber#8578,DebtSaleMonth#8579,GrossCashFromDebtSale#8580,DebtSaleFee#8581,NetCashToInvestorsFromDebtSale#8582,OZVintage#8583] Code Generation: true In [14]: sqlContext.read.parquet('prp_enh1').where(" LoanID=62231 and ( not(PaymentsReceived=0) or not (ExplicitRoll in ('PreviouslyPaidOff', 'PreviouslyChargedOff')))").explain(True) == Parsed Logical Plan == 'Filter (('LoanID = 62231) && (NOT ('PaymentsReceived = 0) || NOT 'ExplicitRoll IN (PreviouslyPaidOff,PreviouslyChargedOff))) Relation[BorrowerRate#8584,MnthRate#8585,ObservationMonth#8586,CycleCounter#8587,LoanID#8588,Loankey#8589,OriginationDate#8590,OriginationQuarter#8591,LoanAmount#8592,Term#8593,LenderRate#8594,ProsperRating#8595,ScheduledMonthlyPaymentAmount#8596,ChargeoffMonth#8597,ChargeoffAmount#8598,CompletedMonth#8599,MonthOfLastPayment#8600,PaymentsReceived#8601,CollectionFees#8602,PrincipalPaid#8603,InterestPaid#8604,LateFees#8605,ServicingFees#8606,RecoveryPayments#8607,RecoveryPrin#8608,DaysPastDue#8609,PriorMonthDPD#8610,ExplicitRoll#8611,SummaryRoll#8612,CumulPrin#8613,EOMPrin#8614,ScheduledPrinRemaining#8615,ScheduledCumulPrin#8616,ScheduledPeriodicPrin#8617,BOMPrin#8618,ListingNumber#8619,DebtSaleMonth#8620,GrossCashFromDebtSale#8621,DebtSaleFee#8622,NetCashToInvestorsFromDebtSale#8623,OZVintage#8624] ParquetRelation[file:/d:/MktLending/prp_enh1] == Analyzed Logical Plan == BorrowerRate: double, MnthRate: double, ObservationMonth: date, CycleCounter: int, LoanID: int, Loankey: string, OriginationDate: date, OriginationQuarter: string, LoanAmount: double, Term: int, LenderRate: double, ProsperRating: string, ScheduledMonthlyPaymentAmount: double, ChargeoffMonth: date, ChargeoffAmount: double, CompletedMonth: date, MonthOfLastPayment: date, PaymentsReceived: double, CollectionFees: double, PrincipalPaid: double, InterestPaid: double, LateFees: double, ServicingFees: double, RecoveryPayments: double, RecoveryPrin: double, DaysPastDue: int, PriorMonthDPD: int, ExplicitRoll: string, SummaryRoll: string, CumulPrin: double, EOMPrin: double, ScheduledPrinRemaining: double, ScheduledCumulPrin: double, ScheduledPeriodicPrin: double, BOMPrin: double, ListingNumber: int, DebtSaleMonth: int, GrossCashFromDebtSale: double, DebtSaleFee: double, NetCashToInvestorsFromDebtSale: double, OZVintage: string Filter ((LoanID#8588 = 62231) && (NOT (PaymentsReceived#8601 = cast(0 as double)) || NOT ExplicitRoll#8611 IN (PreviouslyPaidOff,PreviouslyChargedOff))) Relation[BorrowerRate#8584,MnthRate#8585,ObservationMonth#8586,CycleCounter#8587,LoanID#8588,Loankey#8589,OriginationDate#8590,OriginationQuarter#8591,LoanAmount#8592,Term#8593,LenderRate#8594,ProsperRating#8595,ScheduledMonthlyPaymentAmount#8596,ChargeoffMonth#8597,ChargeoffAmount#8598,CompletedMonth#8599,MonthOfLastPayment#8600,PaymentsReceived#8601,CollectionFees#8602,PrincipalPaid#8603,InterestPaid#8604,LateFees#8605,ServicingFees#8606,RecoveryPayments#8607,RecoveryPrin#8608,DaysPastDue#8609,PriorMonthDPD#8610,ExplicitRoll#8611,SummaryRoll#8612,CumulPrin#8613,EOMPrin#8614,ScheduledPrinRemaining#8615,ScheduledCumulPrin#8616,ScheduledPeriodicPrin#8617,BOMPrin#8618,ListingNumber#8619,DebtSaleMonth#8620,GrossCashFromDebtSale#8621,DebtSaleFee#8622,NetCashToInvestorsFromDebtSale#8623,OZVintage#8624] ParquetRelation[file:/d:/MktLending/prp_enh1] == Optimized Logical Plan == Filter ((LoanID#8588 = 62231) && (NOT (PaymentsReceived#8601 = 0.0) || NOT ExplicitRoll#8611 IN (PreviouslyPaidOff,PreviouslyChargedOff))) Relation[BorrowerRate#8584,MnthRate#8585,ObservationMonth#8586,CycleCounter#8587,LoanID#8588,Loankey#8589,OriginationDate#8590,OriginationQuarter#8591,LoanAmount#8592,Term#8593,LenderRate#8594,ProsperRating#8595,ScheduledMonthlyPaymentAmount#8596,ChargeoffMonth#8597,ChargeoffAmount#8598,CompletedMonth#8599,MonthOfLastPayment#8600,PaymentsReceived#8601,CollectionFees#8602,PrincipalPaid#8603,InterestPaid#8604,LateFees#8605,ServicingFees#8606,RecoveryPayments#8607,RecoveryPrin#8608,DaysPastDue#8609,PriorMonthDPD#8610,ExplicitRoll#8611,SummaryRoll#8612,CumulPrin#8613,EOMPrin#8614,ScheduledPrinRemaining#8615,ScheduledCumulPrin#8616,ScheduledPeriodicPrin#8617,BOMPrin#8618,ListingNumber#8619,DebtSaleMonth#8620,GrossCashFromDebtSale#8621,DebtSaleFee#8622,NetCashToInvestorsFromDebtSale#8623,OZVintage#8624] ParquetRelation[file:/d:/MktLending/prp_enh1] == Physical Plan == Filter ((LoanID#8588 = 62231) && (NOT (PaymentsReceived#8601 = 0.0) || NOT ExplicitRoll#8611 IN (PreviouslyPaidOff,PreviouslyChargedOff))) Scan ParquetRelation[file:/d:/MktLending/prp_enh1][BorrowerRate#8584,MnthRate#8585,ObservationMonth#8586,CycleCounter#8587,LoanID#8588,Loankey#8589,OriginationDate#8590,OriginationQuarter#8591,LoanAmount#8592,Term#8593,LenderRate#8594,ProsperRating#8595,ScheduledMonthlyPaymentAmount#8596,ChargeoffMonth#8597,ChargeoffAmount#8598,CompletedMonth#8599,MonthOfLastPayment#8600,PaymentsReceived#8601,CollectionFees#8602,PrincipalPaid#8603,InterestPaid#8604,LateFees#8605,ServicingFees#8606,RecoveryPayments#8607,RecoveryPrin#8608,DaysPastDue#8609,PriorMonthDPD#8610,ExplicitRoll#8611,SummaryRoll#8612,CumulPrin#8613,EOMPrin#8614,ScheduledPrinRemaining#8615,ScheduledCumulPrin#8616,ScheduledPeriodicPrin#8617,BOMPrin#8618,ListingNumber#8619,DebtSaleMonth#8620,GrossCashFromDebtSale#8621,DebtSaleFee#8622,NetCashToInvestorsFromDebtSale#8623,OZVintage#8624] Code Generation: true was (Author: imachabeli): I'm afraid I don't really know what that means, "plan by explain(true)" Shall I type it in repl? [ https://issues.apache.org/jira/browse/SPARK-12218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047928#comment-15047928 ] Xiao Li commented on SPARK-12218: --------------------------------- Could you provide the plan by explain(true)? [~imachabeli] Thanks! "(not A) or (not B)" -------------------------------------------------------------------------------------------- PaymentsReceived=0 and ExplicitRoll in ('PreviouslyPaidOff', 'PreviouslyChargedOff'))").count() not(PaymentsReceived=0) or not (ExplicitRoll in ('PreviouslyPaidOff', 'PreviouslyChargedOff')))").count() -- This message was sent by Atlassian JIRA (v6.3.4#6332) > Boolean logic in sql does not work "not (A and B)" is not the same as "(not > A) or (not B)" > -------------------------------------------------------------------------------------------- > > Key: SPARK-12218 > URL: https://issues.apache.org/jira/browse/SPARK-12218 > Project: Spark > Issue Type: Bug > Components: SQL > Affects Versions: 1.5.2 > Reporter: Irakli Machabeli > Priority: Blocker > > Two identical queries produce different results > In [2]: sqlContext.read.parquet('prp_enh1').where(" LoanID=62231 and not( > PaymentsReceived=0 and ExplicitRoll in ('PreviouslyPaidOff', > 'PreviouslyChargedOff'))").count() > Out[2]: 18 > In [3]: sqlContext.read.parquet('prp_enh1').where(" LoanID=62231 and ( > not(PaymentsReceived=0) or not (ExplicitRoll in ('PreviouslyPaidOff', > 'PreviouslyChargedOff')))").count() > Out[3]: 28 -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org