Yeah. Database design is always subjective, so everybody has an opinion
about it. But if you're just starting out, I would recommend you follow
the same rules you would in a traditional relational database system.
So two different datasets would mean two different tables, in Hive just as
in an RDBMS.
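
To make the "two datasets, two tables" idea concrete, here is a rough sketch.
It is not Hive: it uses Python's built-in sqlite3 module as a stand-in
relational database, and the table names schema1 and schema2 and the helper
load_and_query are made up for illustration. It loads the two sample rows
from the question into separate tables and queries the columns both share.

```python
import sqlite3


def load_and_query():
    """Load the two sample datasets into separate tables and query both."""
    # Stand-in for Hive (assumption): an in-memory SQLite database.
    conn = sqlite3.connect(":memory:")

    # One table per dataset, mirroring the headers of the two CSV files.
    conn.execute("CREATE TABLE schema1 (name TEXT, address TEXT, phone TEXT)")
    conn.execute(
        "CREATE TABLE schema2 "
        "(id INTEGER, name TEXT, address TEXT, gender TEXT, phone TEXT)"
    )

    # The sample rows from the original question.
    conn.execute(
        "INSERT INTO schema1 VALUES (?, ?, ?)",
        ("Chris", "address1", "999-999-9999"),
    )
    conn.execute(
        "INSERT INTO schema2 VALUES (?, ?, ?, ?, ?)",
        (13, "Tom", "address2", "male", "888-888-8888"),
    )

    # Query both tables together on the columns they have in common.
    rows = conn.execute(
        "SELECT name, address, phone FROM schema1 "
        "UNION ALL "
        "SELECT name, address, phone FROM schema2 "
        "ORDER BY name"
    ).fetchall()
    conn.close()
    return rows


rows = load_and_query()
for row in rows:
    print(row)
```

In Hive itself you would more likely declare each table over the files
already in Hadoop (for example with a delimited, comma-separated row format)
rather than insert rows by hand, but the design choice is the same: one table
per schema, and combine them in the query only when you need to.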

Start there anyway and get your feet wet. :)


On Wed, Aug 21, 2013 at 7:24 AM, Chris Driscol <cdris...@rallydev.com> wrote:

> Hi -
> I just started to get my feet wet with Hive and have a question that I
> have not been able to find an answer to.
>
> Suppose I have 2 CSV files:
> >cat Schema1.csv
> Name, Address, Phone
> Chris, address1, 999-999-9999
>
> and
> >cat Schema2.csv
> Id, Name, Address, Gender, Phone
> 13, Tom, address2, male, 888-888-8888
>
> I put these two files into Hadoop and want to be able to query these 2
> different schemas via Hive.
>
> Do I need to create two tables in Hive to represent both schemas and use a
> join?  Or is there a better way that can handle these two different schemas?
>
> Please reply back with any other specific questions. I realize this is
> somewhat open-ended. Thanks!
>
> --
> -cd
>
