[android-developers] Re: Display list of saved tables to be opened

Bob Kerns Sat, 23 Apr 2011 15:17:45 -0700

It's a database design question, not an Android question. That's not evading 
answering you -- it's probably the most helpful thing I'll tell you! 
Understanding that your question really has nothing in it related to Android 
will help you find the information you need, beyond anything I can tell you.


It's also a fairly basic database schema design problem. You clearly don't 
know the basics of how to design the schema for a database -- so that's what 
you need to go learn. That's really important knowledge for any programmer 
these days, so it'll be well worth your time.

With that said, I'll try to get you started, to connect your problem to the 
concepts.

SQL (and SQLite) databases are *relational* databases. That means, each 
table defines a set of relationships, and you represent your data in terms 
of the relationships. Each table defines a particular type of relationship; 
each column in a table defines a particular role in the relationship; each 
row is a separate relationship of that type.

Generally, I would NOT save each "project" in a separate table in the 
database. Let's look at the concrete concepts you have. (These are sometimes 
called "entities" in database design).

1) Project.
2) A main 2D array. You don't say clearly what it contains, but I infer it 
may be a 2D array of strings, so I'll go with that.
3) A parallel 2D array "A". Again, it's a bit unclear, but it appears it may 
be also a 2D array of strings.
4) A parallel 2D array "B" -- likewise.

So let's define our tables. I'm going to assume that we want fully-general 
2D arrays here, of arbitrary size and content, and that we'll actually use 
the database to represent them. That will illustrate databases a bit better 
than the alternative of encoding your data into a big string, and decoding 
it after you retrieve it.

I'm NOT going to go into the details enough to give you a concrete schema -- 
partly to save my time, and partly because you need to go read up on that 
anyway.

The logical place to start here is the Project table. But I'm not going to 
do that, because I want to illustrate a thinking process and strategy you 
need to be able to go through, called "normalization". So if you are feeling 
that's where we should start, you're right, but be patient.

Let's instead start with your main 2D array. A 2D array is logically a 
relationship between two positive integer coordinates -- let's call them I 
and J -- and a value, which we'll call V since I don't know what your data 
means. Further, each of these bits of data is associated with a project, 
let's call that Project_Name

So we have the following Array_Main
Project_Name I J V
Work         0 0 Elephant
Work         0 1 Fish
Work         0 2 Cow
Work         1 0 Pig
Work         1 1 Sheep
Work         1 2 Horse
Play         0 0 Cat
Play         0 1 Dog
Play         1 0 Hamster
Play         1 1 Iguana

OK, you can see we've represented two projects here and their main arrays, 
of two different shapes. We can do things like:

SELECT I, J, V FROM Array_Main
WHERE Project_Name = 'Work'

to get all of the main data for the project named 'Work'.

We can also select just a part of it, for example:

SELECT I, J, V  FROM Array_Main
WHERE Project_Name = 'Work'
    AND  I < 2 AND J < 2;

(Thus illustrating we can store and use more data than we need to actually 
load at any one time).

Now we can do the same for the other two arrays, in two additional tables, 
Array_A and  Array_B.

OK, now let's take a look at what we have -- we have a piece of duplicated 
information in every row, in three tables -- the Project_Name. Collapsing 
this sort of duplication is called "normalization".  It's a whole topic of 
its own, but here what we want to do is pretty simple: Replace this string 
with a project ID, and have a separate table that establishes the 
relationship between project ID and name.

This has a number of benefits, most of which I won't go into -- but one very 
obvious one is, you can now rename the project, without  touching every data 
row in the database! It's also more efficient, extensible, etc. etc.

So now we have a table Project:
Project_ID Project_Name
         1 Work
         2 Play

and Array_Main (and _A and _B):
Project_ID I J V
1          0 0 Elephant
1          0 1 Fish
1          0 2 Cow
1          1 0 Pig
1          1 1 Sheep
1          1 2 Horse
1          0 0 Cat
1          0 1 Dog
1          1 0 Hamster
1          1 1 Iguana

Now, we can select our main data thus:

SELECT I, J, V  FROM Array_MAIN
INNER JOIN Project ON Array_Main.Project_ID = Project.Project_ID
WHERE Project_Name = 'Work';.

We can even write:

SELECT Array_Main.I, Array_Main.J, Array_Main.V as MAIN, Array_A.V as A, 
Array_B.V as B FROM Project
INNER JOIN Array_Main ON  Project.Project_ID = Array_Main.Project_ID
INNER JOIN Array_A ON Project.Project_ID = Array_A.Project_ID AND 
Array_Main.I = Array_A.I AND Array_Main.J = Array_A.J
INNER JOIN Array_B ON Project.Project_ID = Array_B.Project_ID AND 
Array_Main.I = Array_B.I AND Array_Main.J = Array_B.J
WHERE Project_Name = 'Work'

and get all three strings for each coordinate simultaneously. The result 
table looks like this:
I J MAIN A B

Whether that's a good idea or not, depends on what you're doing -- but in 
general, databases do a good job of optimizing your accesses, so often it's 
more efficient than what you'd do yourself.

We're not done yet, though. We have some  work to do that will make this 
more efficient, and also make it more robust.

Let's start with the project table. Our Project_ID is a simple value, that 
we'll be looking up a lot. This makes it a good candidate for being the 
"primary key". Primary keys are always unique, and that's something we'd 
like to enforce.

We reference Project.Project_ID from our various other tables. We only want 
to reference a valid project, and we want to make that as efficient as 
possible. We do this via a foreign key constraint. This says, that (for 
example), Main_A.Project_ID always refers to a Project.Project_ID value (and 
thus, a specific Project row). Further, we can specify that if a particular 
Project_ID is deleted or changed, we can specify what should happen to our 
data. In this case, if you delete a project, you also want that project's 
data to go, so we'd want to specify cascading delete.

Similarly, we only want one project per project name. We do this by 
specifying an index on the Project table, listing just the Project_Name 
field, and indicating it should be UNIQUE. This will prevent duplicates. But 
it ALSO gives the database an efficient way to find projects by name, even 
if there are thousands or millions of them!

Finally, for each of our data tables, we note that the combination of 
Project_ID, I, and J should be unique. A similar 3-element index will 
enforce that constraint, and also greatly speed up that complicated query I 
gave above, as it gives the database information about the contents in 
sorted order.

It's not so hard once you get the hang of it, but everyone starts off not 
knowing what mindset they need, and the concepts (index, primary key, 
unique, etc. etc.) and the associated terminology.

One final bit of advice: Do not, if you can at all avoid it, construct SQL 
expressions by concatenating strings at runtime. Instead, use any available 
mechanism (e.g. prepared statements, SQLIteQueryBuilder/SQLiteQuery, etc.) 
to construct a parameterized query that then reuse, filling in the various 
fields. For example, if the user supplies the Project_Name, you should be 
using a query parameter for the value to match, rather than construction a 
query with Project_Name = 'Work'. Not only is this potentially more 
efficient, it avoids syntactic problems, which can result in program failure 
and/or security holes. (This category of security problem is termed a 'SQL 
Injection Attack'. See, for example, the case of Little Bobby Tables: 
http://xkcd.com/327/ for how this can be used).


-- 
You received this message because you are subscribed to the Google
Groups "Android Developers" group.
To post to this group, send email to android-developers@googlegroups.com
To unsubscribe from this group, send email to
android-developers+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/android-developers?hl=en

[android-developers] Re: Display list of saved tables to be opened

Reply via email to