Re: Mysql - Hive Sync
This can't be done, since INSERT, UPDATE and DELETE are not supported in Hive. The MySQL Applier for Hadoop package serves the same purpose as the prototype tool I intended to develop. Link for MySQL Applier for Hadoop: http://dev.mysql.com/tech-resources/articles/mysql-hadoop-applier.html

*Regards Muthupandi.K*

On Wed, Sep 3, 2014 at 10:35 AM, Muthu Pandi muthu1...@gmail.com wrote:

Yeah, but we can't make it work as near real time. Also, my table doesn't have a column like 'ID' to use for --check-column; that's why I opted out of Sqoop.

*Regards Muthupandi.K*

On Wed, Sep 3, 2014 at 10:28 AM, Nitin Pawar nitinpawar...@gmail.com wrote:

Have you looked at Sqoop?

On Wed, Sep 3, 2014 at 10:15 AM, Muthu Pandi muthu1...@gmail.com wrote:

Dear All, I am developing a prototype for syncing tables from MySQL to Hive using Python and JDBC. Is it a good idea to use JDBC for this purpose? My use case is generating sales reports with Hive from the data pulled out of MySQL by the prototype tool. My data will be around 2 GB/day.

*Regards Muthupandi.K*

-- Nitin Pawar
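For the sales-report use case described above, a minimal sketch of the Hive side might look like the following. The table name, columns, delimiter, and HDFS path are illustrative assumptions, not from the thread; the real schema would come from the MySQL tables being synced.

    -- Hypothetical staging table over files written by the sync tool
    CREATE EXTERNAL TABLE sales_staging (
      txid   BIGINT,
      uid    STRING,
      amount DOUBLE,
      vendor STRING,
      ts     STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION '/data/sales_staging';

    -- Daily sales report aggregated in Hive
    SELECT vendor, SUM(amount) AS total_sales, COUNT(*) AS num_txns
    FROM sales_staging
    WHERE to_date(ts) = '2014-09-03'
    GROUP BY vendor;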
Error Configuring Object In Hive Joins
Hi all, I have two Hive tables pointing to the same location (folder). When I join these tables, Hive fails in map-reduce with the error:

java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 9 more

Any idea why this would happen?

-- Sreenath S Kamath, Bangalore, Ph No: +91-9590989106
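To make the reported setup concrete, a sketch of the kind of configuration being described might look like the following. The table names, columns, and HDFS path are assumptions, since the original message does not include the DDL.

    -- Two tables declared over the same HDFS folder (illustrative only)
    CREATE EXTERNAL TABLE events_a (id INT, payload STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/data/shared/events';

    CREATE EXTERNAL TABLE events_b (id INT, payload STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/data/shared/events';

    -- The join that reportedly fails at map-reduce time
    SELECT a.id, a.payload, b.payload
    FROM events_a a
    JOIN events_b b ON a.id = b.id;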
Re: MatchPath UDF usage info ?
Hi, Many thanks for sending these links. Looking forward to more documentation around this. BTW, why does hive-exec-0.13.0.2.1.1.0-385.jar not have any class files for the MatchPath UDF? Have they been chucked out to a separate JAR file? I can see that hive-exec-0.13.0.jar has the appropriate class files, and I have tried to use them. They work well with the demo data set, but we certainly need more documentation around this. Regards, Asif Abbasi

On Tue, Aug 26, 2014 at 6:42 AM, Lefty Leverenz leftylever...@gmail.com wrote:

Thanks for pointing out that we still need documentation for this in the wiki. (I've added a doc comment to HIVE-5087 https://issues.apache.org/jira/browse/HIVE-5087.) In the meantime, googling "Hive npath" turned up these sources of information:

- https://github.com/hbutani/SQLWindowing/wiki
- http://www.slideshare.net/Hadoop_Summit/analytical-queries-with-hive (slides 20-21)
- http://www.justinjworkman.com/big-data/using-npath-with-apache-hive/

-- Lefty

On Mon, Aug 25, 2014 at 8:27 AM, Muhammad Asif Abbasi asif.abb...@gmail.com wrote:

Hi All, I am trying to use the MatchPath UDF (previously called NPath). Does anybody have a document around its syntax and usage? Regards, Asif
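For anyone hunting for syntax in the meantime, the sketch below is adapted from the ptf_matchpath.q test that ships with the Hive source (it uses the flights_tiny sample table from the test data). Treat the argument layout as an assumption to verify against your build: arg1 appears to be the symbol pattern, the following argument pairs name each symbol and give its boolean expression, and the last argument lists the expressions to emit for every matched path.

    SELECT origin_city_name, fl_num, year, month, day_of_month, sz, tpath
    FROM matchpath(ON
           flights_tiny
           DISTRIBUTE BY fl_num
           SORT BY year, month, day_of_month
         arg1('LATE.LATE+'),                  -- pattern of symbols to match
         arg2('LATE'), arg3(arr_delay > 15),  -- symbol name and its predicate
         arg4('origin_city_name, fl_num, year, month, day_of_month, size(tpath) as sz, tpath[0].day_of_month as tpath')
         );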
Re: Reading and Writing with Hive 0.13 from a Yarn application
Hi Nathan, This was done in https://issues.apache.org/jira/browse/HIVE-6248. The reasoning was to minimize the API surface area exposed to users, so that they are immune to incompatible changes in internal classes, which makes it easier for them to consume the API without worrying about version upgrades. It seems that in the process some of the functionality went away. Which info are you looking for, exactly? Is it a String[] getBlockLocations() equivalent of InputSplit? If so, we can consider adding that in ReaderContext, since that need not expose any Hadoop or Hive classes. Thanks, Ashutosh

On Tue, Sep 2, 2014 at 5:26 PM, Nathan Bamford nathan.bamf...@redpoint.net wrote:

Hi, My company has been working on a Yarn application for a couple of years -- we essentially take the place of MapReduce and split our data and processing ourselves. One of the things we've been working to support is Hive access, and the HCatalog interfaces and API seemed perfect. Using this information: https://hive.apache.org/javadocs/hcat-r0.5.0/readerwriter.html and TestReaderWriter.java from the source code, I was able to create and use HCatSplits to allow balanced, data-local parallel reading (using the size and locations methods available from each HCatSplit). Much to my dismay, 0.13 removes a lot of that functionality. The ReaderContext class is now an interface that only exposes numSplits, whereas all of the other methods are in the inaccessible (package-only) ReaderContextImpl class. Since I no longer have access to the actual HCatSplits from the ReaderContext, I am unable to process them and send them to our Yarn app on the data-local nodes. My only choice seems to be to partition out the splits to slave nodes more or less at random. Does anyone know if, as of 0.13, this is the intended way to interface with Hive via non-Hadoop Yarn applications? Is the underlying HCatSplit only intended for internal use now? Thanks, Nathan Bamford
Re: MatchPath UDF usage info ?
Hi Muhammad, From what I googled a few months ago on the subject, the MatchPath UDF has been removed from the Cloudera and Hortonworks releases because Teradata claims it violates one of their patents (apparently renaming it did not suffice). I guess that if you really need it, it might be possible to add it yourself as an external UDF, since the code is still available out there, but I have no idea whether Teradata would have the right to come after you (or not?) if you do. By the way, if anyone has news on the current situation with MatchPath and Teradata, that would be welcome. Furcy

2014-09-03 17:18 GMT+02:00 Muhammad Asif Abbasi asif.abb...@gmail.com:

Hi, Many thanks for sending these links. Looking forward to more documentation around this. BTW, why does hive-exec-0.13.0.2.1.1.0-385.jar not have any class files for the MatchPath UDF? Have they been chucked out to a separate JAR file? I can see that hive-exec-0.13.0.jar has the appropriate class files, and I have tried to use them. They work well with the demo data set, but we certainly need more documentation around this. Regards, Asif Abbasi

On Tue, Aug 26, 2014 at 6:42 AM, Lefty Leverenz leftylever...@gmail.com wrote:

Thanks for pointing out that we still need documentation for this in the wiki. (I've added a doc comment to HIVE-5087 https://issues.apache.org/jira/browse/HIVE-5087.) In the meantime, googling "Hive npath" turned up these sources of information:

- https://github.com/hbutani/SQLWindowing/wiki
- http://www.slideshare.net/Hadoop_Summit/analytical-queries-with-hive (slides 20-21)
- http://www.justinjworkman.com/big-data/using-npath-with-apache-hive/

-- Lefty

On Mon, Aug 25, 2014 at 8:27 AM, Muhammad Asif Abbasi asif.abb...@gmail.com wrote:

Hi All, I am trying to use the MatchPath UDF (previously called NPath). Does anybody have a document around its syntax and usage? Regards, Asif
Re: MatchPath UDF usage info ?
Hi Furcy, Many thanks for your email :) My latest info was that the rename took place due to objections by Teradata, but I didn't know whether they had actually requested that it be taken off the distributions entirely. Does anybody else have an idea on the licensing aspect of this? What exactly has Teradata patented? Is it the technique of parsing the rows in such a manner? Any tips/techniques would be highly appreciated. Regards, Asif Abbasi

On Wed, Sep 3, 2014 at 5:30 PM, Furcy Pin furcy@flaminem.com wrote:

Hi Muhammad, From what I googled a few months ago on the subject, the MatchPath UDF has been removed from the Cloudera and Hortonworks releases because Teradata claims it violates one of their patents (apparently renaming it did not suffice). I guess that if you really need it, it might be possible to add it yourself as an external UDF, since the code is still available out there, but I have no idea whether Teradata would have the right to come after you (or not?) if you do. By the way, if anyone has news on the current situation with MatchPath and Teradata, that would be welcome. Furcy

2014-09-03 17:18 GMT+02:00 Muhammad Asif Abbasi asif.abb...@gmail.com:

Hi, Many thanks for sending these links. Looking forward to more documentation around this. BTW, why does hive-exec-0.13.0.2.1.1.0-385.jar not have any class files for the MatchPath UDF? Have they been chucked out to a separate JAR file? I can see that hive-exec-0.13.0.jar has the appropriate class files, and I have tried to use them. They work well with the demo data set, but we certainly need more documentation around this. Regards, Asif Abbasi

On Tue, Aug 26, 2014 at 6:42 AM, Lefty Leverenz leftylever...@gmail.com wrote:

Thanks for pointing out that we still need documentation for this in the wiki. (I've added a doc comment to HIVE-5087 https://issues.apache.org/jira/browse/HIVE-5087.) In the meantime, googling "Hive npath" turned up these sources of information:

- https://github.com/hbutani/SQLWindowing/wiki
- http://www.slideshare.net/Hadoop_Summit/analytical-queries-with-hive (slides 20-21)
- http://www.justinjworkman.com/big-data/using-npath-with-apache-hive/

-- Lefty

On Mon, Aug 25, 2014 at 8:27 AM, Muhammad Asif Abbasi asif.abb...@gmail.com wrote:

Hi All, I am trying to use the MatchPath UDF (previously called NPath). Does anybody have a document around its syntax and usage? Regards, Asif
Hive hwi setup windows 8 using cygwin
Dear All, Kindly help with the HWI setup. I am using Cygwin on Windows 8, JDK 1.7, Ant 1.9. I have successfully built Hive and am able to start the HWI war using the command hive --service hwi. It shows "Started SocketConnector@0.0.0.0:" in the console. The problem I am facing is that when I open the browser and try to access the HWI URL (localhost:/hwi), I get the error below in the console, and the web page also displays the stack trace of the same error.

Error:

2014-09-03 22:48:36,475 ERROR [1231565732@qtp-362433913-0] compiler.Compiler (AntCompiler.java:generateClass(225)) - Javac exception
Unable to find a javac compiler; com.sun.tools.javac.Main is not on the classpath. Perhaps JAVA_HOME does not point to the JDK. It is currently set to "C:\Program Files\Java\jdk1.7.0_51\jre"
    at org.apache.tools.ant.taskdefs.compilers.CompilerAdapterFactory.getCompiler(CompilerAdapterFactory.java:129)

I have tried all the suggested solutions for Ant path-related issues, yet the problem is still there. I have also checked Ant: it works fine in both Cygwin and Windows, and I am able to run all the lifecycle tasks from Ant. Please help. Thanks, Prashant
RE: Reading and Writing with Hive 0.13 from a Yarn application
Hi Ashutosh, Thanks for the reply! Well, we are a Yarn app that is essentially doing the same things MapReduce does. For regular files in Hadoop, we get the block locations and sizes and perform some internal sorting and load balancing on the master, which then creates the slave Yarn apps on individual nodes for reading. We strive for data locality as much as possible. To interface with Hive, the HCatalog API seemed like the appropriate interface. It does a lot of what we want via the ReadEntity, allowing us to query and read the Hive tables at a high level. I used the readerwriter example (from Hive 0.12) to get things running, but I was using HCatSplit just like our internal split classes. I retrieved the splits from the ReaderContext and ran them through the same sorting algorithms, then serialized them and sent them to the individual Yarn apps, etc. I understand the rationale for the smaller API, which is why I wondered if there's another avenue I should be pursuing as a Yarn app (metadata vs. HCatalog, for instance). All that being said :), the ability to get the block locations (and sizes, if possible) would certainly solve my problems. Thanks, Nathan

From: Ashutosh Chauhan hashut...@apache.org
Sent: Wednesday, September 3, 2014 9:16 AM
To: user@hive.apache.org
Subject: Re: Reading and Writing with Hive 0.13 from a Yarn application

Hi Nathan, This was done in https://issues.apache.org/jira/browse/HIVE-6248. The reasoning was to minimize the API surface area exposed to users, so that they are immune to incompatible changes in internal classes, which makes it easier for them to consume the API without worrying about version upgrades. It seems that in the process some of the functionality went away. Which info are you looking for, exactly? Is it a String[] getBlockLocations() equivalent of InputSplit? If so, we can consider adding that in ReaderContext, since that need not expose any Hadoop or Hive classes. Thanks, Ashutosh

On Tue, Sep 2, 2014 at 5:26 PM, Nathan Bamford nathan.bamf...@redpoint.net wrote:

Hi, My company has been working on a Yarn application for a couple of years -- we essentially take the place of MapReduce and split our data and processing ourselves. One of the things we've been working to support is Hive access, and the HCatalog interfaces and API seemed perfect. Using this information: https://hive.apache.org/javadocs/hcat-r0.5.0/readerwriter.html and TestReaderWriter.java from the source code, I was able to create and use HCatSplits to allow balanced, data-local parallel reading (using the size and locations methods available from each HCatSplit). Much to my dismay, 0.13 removes a lot of that functionality. The ReaderContext class is now an interface that only exposes numSplits, whereas all of the other methods are in the inaccessible (package-only) ReaderContextImpl class. Since I no longer have access to the actual HCatSplits from the ReaderContext, I am unable to process them and send them to our Yarn app on the data-local nodes. My only choice seems to be to partition out the splits to slave nodes more or less at random. Does anyone know if, as of 0.13, this is the intended way to interface with Hive via non-Hadoop Yarn applications? Is the underlying HCatSplit only intended for internal use now? Thanks, Nathan Bamford
Re: Reading and Writing with Hive 0.13 from a Yarn application
This API is designed precisely for use cases like yours, so I would say the API is failing if it cannot serve what you are trying to do with it. I encourage you to use this API and to consider the current shortcoming a missing feature. Feel free to file a JIRA requesting the addition of these methods to ReaderContext. Patches are welcome too :) Hope it helps, Ashutosh

On Wed, Sep 3, 2014 at 11:12 AM, Nathan Bamford nathan.bamf...@redpoint.net wrote:

Hi Ashutosh, Thanks for the reply! Well, we are a Yarn app that is essentially doing the same things MapReduce does. For regular files in Hadoop, we get the block locations and sizes and perform some internal sorting and load balancing on the master, which then creates the slave Yarn apps on individual nodes for reading. We strive for data locality as much as possible. To interface with Hive, the HCatalog API seemed like the appropriate interface. It does a lot of what we want via the ReadEntity, allowing us to query and read the Hive tables at a high level. I used the readerwriter example (from Hive 0.12) to get things running, but I was using HCatSplit just like our internal split classes. I retrieved the splits from the ReaderContext and ran them through the same sorting algorithms, then serialized them and sent them to the individual Yarn apps, etc. I understand the rationale for the smaller API, which is why I wondered if there's another avenue I should be pursuing as a Yarn app (metadata vs. HCatalog, for instance). All that being said :), the ability to get the block locations (and sizes, if possible) would certainly solve my problems. Thanks, Nathan

--
From: Ashutosh Chauhan hashut...@apache.org
Sent: Wednesday, September 3, 2014 9:16 AM
To: user@hive.apache.org
Subject: Re: Reading and Writing with Hive 0.13 from a Yarn application

Hi Nathan, This was done in https://issues.apache.org/jira/browse/HIVE-6248. The reasoning was to minimize the API surface area exposed to users, so that they are immune to incompatible changes in internal classes, which makes it easier for them to consume the API without worrying about version upgrades. It seems that in the process some of the functionality went away. Which info are you looking for, exactly? Is it a String[] getBlockLocations() equivalent of InputSplit? If so, we can consider adding that in ReaderContext, since that need not expose any Hadoop or Hive classes. Thanks, Ashutosh

On Tue, Sep 2, 2014 at 5:26 PM, Nathan Bamford nathan.bamf...@redpoint.net wrote:

Hi, My company has been working on a Yarn application for a couple of years -- we essentially take the place of MapReduce and split our data and processing ourselves. One of the things we've been working to support is Hive access, and the HCatalog interfaces and API seemed perfect. Using this information: https://hive.apache.org/javadocs/hcat-r0.5.0/readerwriter.html and TestReaderWriter.java from the source code, I was able to create and use HCatSplits to allow balanced, data-local parallel reading (using the size and locations methods available from each HCatSplit). Much to my dismay, 0.13 removes a lot of that functionality. The ReaderContext class is now an interface that only exposes numSplits, whereas all of the other methods are in the inaccessible (package-only) ReaderContextImpl class. Since I no longer have access to the actual HCatSplits from the ReaderContext, I am unable to process them and send them to our Yarn app on the data-local nodes. My only choice seems to be to partition out the splits to slave nodes more or less at random. Does anyone know if, as of 0.13, this is the intended way to interface with Hive via non-Hadoop Yarn applications? Is the underlying HCatSplit only intended for internal use now? Thanks, Nathan Bamford
How to use joins and averages both in the same hive query
I have two tables in Hive:

Table1: uid, txid, amt, vendor
Table2: uid, txid

Now I need to join the tables on txid, which basically confirms that a transaction was finally recorded. There will be some transactions which are present only in Table1 and not in Table2. I need to find the average transaction match rate per user (uid) per vendor. Then I need to find the average of these averages by adding up the averages and dividing by the number of unique users per vendor.

Let's say I have the data:

Table1:
u1,120,44,vend1
u1,199,33,vend1
u1,100,23,vend1
u1,101,24,vend1
u2,200,34,vend1
u2,202,32,vend2

Table2:
u1,100
u1,101
u2,200
u2,202

Example for vendor vend1:
u1 - avg transaction find rate = 2 (matches found in both tables, Table1 and Table2) / 4 (total occurrences in Table1) = 0.5
u2 - avg transaction find rate = 1/1 = 1
Avg of avgs = (0.5 + 1) (sum of avgs) / 2 (total unique users) = 0.75

Required output:
vend1,0.75
vend2,1

I can't seem to get both the count of matches and the count of occurrences in Table1 per user per vendor in one Hive query. I have reached this query and can't figure out how to change it further:

SELECT A.vendor, A.uid, count(*) AS totalmatchesperuser
FROM Table1 A JOIN Table2 B ON A.uid = B.uid AND B.txid = A.txid
GROUP BY A.vendor, A.uid

Any help would be great.
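For reference, one possible way to express the per-user match rate and the average of those rates, written directly against the table and column names given above, is sketched below. This is an untested sketch, not part of the original thread; it uses a LEFT OUTER JOIN so unmatched Table1 rows still count toward each user's total.

    SELECT t.vendor,
           AVG(t.match_rate) AS avg_of_avgs
    FROM (
      -- per-user, per-vendor match rate: matched rows / all Table1 rows
      SELECT a.vendor,
             a.uid,
             SUM(CASE WHEN b.txid IS NOT NULL THEN 1 ELSE 0 END) / COUNT(*) AS match_rate
      FROM Table1 a
      LEFT OUTER JOIN Table2 b
        ON a.uid = b.uid AND a.txid = b.txid
      GROUP BY a.vendor, a.uid
    ) t
    GROUP BY t.vendor;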
Re: Collect_set() of non-primitive types
I don't know of anything like what you want, at least up to Hive 0.11. However, you could try something like this:

INSERT OVERWRITE TABLE rollup
SELECT id, start_time, collect_set(concat_ws(',', objects.name, objects.value, objects.type)) AS product_details
FROM bar
GROUP BY id, start_time;

It's a bit hacky, but it does the trick. It basically concats all the fields of each object you care about into a string, and then collects a set of those strings. Then, when you want to extract a field from the product_details array, just do this (say you want to extract the very first product's name in each array):

SELECT split(a.product_details[0], ',')[0] AS first_name FROM rollup a;

Hope that helps! Best Regards, Nishant Kelkar

On Wed, Sep 3, 2014 at 1:47 PM, anusha Mangina anusha.mang...@gmail.com wrote:

I have a table defined as:

CREATE TABLE foo (
  id INT,
  start_time STRING,
  name STRING,
  value STRING,
  type STRING
)

The 'name', 'value' and 'type' fields actually describe another object, and I'd like to turn these into STRUCTs, something like:

CREATE TABLE bar (
  id INT,
  start_time STRING,
  object STRUCT<name: STRING, value: STRING, type: STRING>
)

However, I'd also like to create a rollup table containing the most recent result for a given 'id' field. There can be multiple different 'object' structs for a given combination of 'id' and 'start_time', so I thought I'd turn this into an array of structs. My ideal rollup table would look like:

CREATE TABLE rollup (
  id INT,
  start_time STRING,
  objects ARRAY<STRUCT<name: STRING, value: STRING, type: STRING>>
)

However, I can't do this because the following query fails:

INSERT OVERWRITE TABLE rollup
SELECT id, start_time, collect_set(object)
FROM bar
GROUP BY id, start_time

Here's the error I get:

FAILED: UDFArgumentTypeException Only primitive type arguments are accepted but struct<name:string,value:string,type:string> was passed as parameter 1.

Is there any way I can do this?
Re: Collect_set() of non-primitive types
Geez, too many mistakes for the day :P I meant the following above:

CREATE TABLE rollup_new AS
SELECT id, start_time, collect_set(concat_ws(',', object.name, object.value, object.type)) AS product_details
FROM bar
GROUP BY id, start_time;

The change is in the table creation query, since an INSERT OVERWRITE would expect a struct and get a string instead, thus throwing errors, right?

On Wed, Sep 3, 2014 at 2:06 PM, Nishant Kelkar nishant@gmail.com wrote:

Sorry, I meant the following in my example:

INSERT OVERWRITE TABLE rollup
SELECT id, start_time, collect_set(concat_ws(',', object.name, object.value, object.type)) AS product_details
FROM bar
GROUP BY id, start_time;

object instead of objects, as you have in your example :) Best Regards, Nishant Kelkar

On Wed, Sep 3, 2014 at 2:03 PM, Nishant Kelkar nishant@gmail.com wrote:

I don't know of anything like what you want, at least up to Hive 0.11. However, you could try something like this:

INSERT OVERWRITE TABLE rollup
SELECT id, start_time, collect_set(concat_ws(',', objects.name, objects.value, objects.type)) AS product_details
FROM bar
GROUP BY id, start_time;

It's a bit hacky, but it does the trick. It basically concats all the fields of each object you care about into a string, and then collects a set of those strings. Then, when you want to extract a field from the product_details array, just do this (say you want to extract the very first product's name in each array):

SELECT split(a.product_details[0], ',')[0] AS first_name FROM rollup a;

Hope that helps! Best Regards, Nishant Kelkar

On Wed, Sep 3, 2014 at 1:47 PM, anusha Mangina anusha.mang...@gmail.com wrote:

I have a table defined as:

CREATE TABLE foo (
  id INT,
  start_time STRING,
  name STRING,
  value STRING,
  type STRING
)

The 'name', 'value' and 'type' fields actually describe another object, and I'd like to turn these into STRUCTs, something like:

CREATE TABLE bar (
  id INT,
  start_time STRING,
  object STRUCT<name: STRING, value: STRING, type: STRING>
)

However, I'd also like to create a rollup table containing the most recent result for a given 'id' field. There can be multiple different 'object' structs for a given combination of 'id' and 'start_time', so I thought I'd turn this into an array of structs. My ideal rollup table would look like:

CREATE TABLE rollup (
  id INT,
  start_time STRING,
  objects ARRAY<STRUCT<name: STRING, value: STRING, type: STRING>>
)

However, I can't do this because the following query fails:

INSERT OVERWRITE TABLE rollup
SELECT id, start_time, collect_set(object)
FROM bar
GROUP BY id, start_time

Here's the error I get:

FAILED: UDFArgumentTypeException Only primitive type arguments are accepted but struct<name:string,value:string,type:string> was passed as parameter 1.

Is there any way I can do this?
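To get structs back out of the concatenated strings on the read side, one option is to split each array element and rebuild it with named_struct. This is a sketch only, assuming the rollup_new table produced by the workaround above and the ',' delimiter chosen there.

    -- Rebuild the first object of each row as a struct (illustrative)
    SELECT id,
           start_time,
           named_struct(
             'name',  split(product_details[0], ',')[0],
             'value', split(product_details[0], ',')[1],
             'type',  split(product_details[0], ',')[2]
           ) AS first_object
    FROM rollup_new;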
Re: How to use joins and averages both in the same hive query
try this:

SELECT t.vendor, AVG(t.totalmatchesperuser) AS avgmatches
FROM (
  SELECT A.vendor, A.uid, count(*) AS totalmatchesperuser
  FROM Table1 A
  INNER JOIN Table2 B ON A.uid = B.uid AND B.txid = A.txid
  GROUP BY A.vendor, A.uid
) t
GROUP BY t.vendor

On Thu, Sep 4, 2014 at 3:38 AM, Mohit Durgapal durgapalmo...@gmail.com wrote:

I have two tables in Hive:

Table1: uid, txid, amt, vendor
Table2: uid, txid

Now I need to join the tables on txid, which basically confirms that a transaction was finally recorded. There will be some transactions which are present only in Table1 and not in Table2. I need to find the average transaction match rate per user (uid) per vendor. Then I need to find the average of these averages by adding up the averages and dividing by the number of unique users per vendor.

Let's say I have the data:

Table1:
u1,120,44,vend1
u1,199,33,vend1
u1,100,23,vend1
u1,101,24,vend1
u2,200,34,vend1
u2,202,32,vend2

Table2:
u1,100
u1,101
u2,200
u2,202

Example for vendor vend1:
u1 - avg transaction find rate = 2 (matches found in both tables, Table1 and Table2) / 4 (total occurrences in Table1) = 0.5
u2 - avg transaction find rate = 1/1 = 1
Avg of avgs = (0.5 + 1) (sum of avgs) / 2 (total unique users) = 0.75

Required output:
vend1,0.75
vend2,1

I can't seem to get both the count of matches and the count of occurrences in Table1 per user per vendor in one Hive query. I have reached this query and can't figure out how to change it further:

SELECT A.vendor, A.uid, count(*) AS totalmatchesperuser
FROM Table1 A JOIN Table2 B ON A.uid = B.uid AND B.txid = A.txid
GROUP BY A.vendor, A.uid

Any help would be great.

-- 丁桂涛