Hi All, I need help with speeding up my samples collection which is
processed using C# codes and SQLite. I am doing a random sampling
experiment using sqlite to find the most dominant types of lab. findings.
We have about 26 years of lab experiments data provided  in a table such as:

CREATE TABLE [Data] (
[ExperimentNumber] INTEGER  NOT NULL,
[ExperimentDate] DATE  NOT NULL,
[Type1] INTEGER  NOT NULL,
[Type2] INTEGER  NOT NULL,
[Type3] INTEGER  NOT NULL
)

Actually we have 22 'Type' fields but  for simplicity, we assume to have
only 3 'Type'  fields. Moreover, the types found in each experiment are
separated as well as combined and sorted in the field of CType in table
Types below.

CREATE TABLE [Types] (
[ExperimentNumber] INTEGER  NOT NULL,
[ExperimentDate] DATE  NOT NULL,
[CType] TEXT  NOT NULL
)

For example, if we have in table Data as:
insert into Data values(12, '1989-02-29', 88, 25, 10);

then we have in table Types precisely as
insert into Types values(12. '1989-02-29', 88);
insert into Types values(12. '1989-02-29', 25);
insert into Types values(12. '1989-02-29', 10);
insert into Types values(12. '1989-02-29', 10|25|);
insert into Types values(12. '1989-02-29', 10|88);
insert into Types values(12. '1989-02-29', 25|88);
insert into Types values(12. '1989-02-29', 10|25|88);


To find the most dominant 'Types', sample experiments are selected randomly
from a set of timeframes which is also defined randomly. Most frequent
'Types' found from the selected samples are then recorded and compared to a
randomly selected experiment number whose experiment date is later than any
experiment date of the selected samples. Matched 'Type' are recorded into a
database table for further analysis. So, I created the following tables:

CREATE TABLE [SampleDetects] (
[ExperimentRanges] TEXT  NOT NULL,
[ExperimentNumToMatch] TEXT  NOT NULL
)

CREATE TABLE [SampleRanges] (
[Samplerangeid] INTEGER  NOT NULL,
[start] INTEGER NOT NULL,
[end] INTEGER NOT NULL
)

CREATE TABLE [TopSample] (
[CType] TEXT  NOT NULL,
[frequency] INTEGER  NOT NULL,
[Samplerangeid] INTEGER  NOT NULL
)

CREATE TABLE [TopSampleMatches] (
[matchedCType] TEXT  NOT NULL,
[Samplerangeid] INTEGER  NOT NULL,
[Experimentno] INTEGER NOT NULL
)

While the sqlite3.dll wrapper functions are not displayed here, but  the
following  C# code should be comprehensive :
<CODE>

        void Sampling(string database, int samplesize, int  MaxTimeframes,
int noofExpToMatch)
        {
            random = new Random();

            Classes.clsSQLite3DLL sqlite3 = new Classes.clsSQLite3DLL();
//sqlite3.dll wrapper

            int count = 0;  //counter to increment
            int minexpno = sqlite3.getMinMaxExpNo(database, "Min");
//Obtain min. experiment no.
            int maxexpno = sqlite3.getMinMaxExpNo(database, "Max");
//...and max. experiment no.
            int Samplerangeid = sqlite3.getLastSampleRangeID(database);
//get the last samplerangeid in the database

            while (count < samplesize)
            {
                //new cycle initialization
starts---------------------------------------------------------------------------------------

                //the whole experiment population is divided into no of
time frames
                int nooftimeframes = random.Next(1, MaxTimeframes);
//define the number of timeframes

                //each timeframe has an equal interval
                //minus 20 is placed to ensure availability of later
experiments to compare with
                int interval = Convert.ToInt32((maxexpno - minexpno - 20) /
nooftimeframes);

                int[] start = new int[nooftimeframes]; //to store the
lowest experiment no. in each timeframe
                int[] end = new int[nooftimeframes]; //to store the highest
experiment no. in each timeframe

                int _minexpno = minexpno; //the lowest  experiment number
found in the database

                int maxnum = 0;  //the maximum exepriment number of all the
randomly selected samples

                string Expnumbers = string.Empty;  //to store the starts
and the ends of experiment numbers of each intervals.
                         //This string will be inserted into table
SampleDetects to ensure only unique values collected.

                int matchcount = 0; //represents the number of experiments
to compare with for incremental purpose
                int ExpNo_tomatch = 0; //to store random experiment number
to compare with

                string sqlmid = string.Empty;
                string sqlhead = string.Empty;
                string sqltail = " group by CType order by count(*) desc;";
                string sql = string.Empty;

                 //initialization
ends-----------------------------------------------------------------------------------------------------


                //select range of experiment numbers for each timeframe
                for (int j = 0; j < nooftimeframes; ++j)
                {
                    start[j] = random.Next(_minexpno, _minexpno + interval);
                    end[j] = random.Next(start[j] + 1, _minexpno +
interval);
                    _minexpno = _minexpno + interval + 1;

                    if (end[j] > maxnum)
                        maxnum = end[j];  //to store maximum of experiment
no. in the collected ranges

                    if (j != 0)
                    {
                        sqlmid += " or ";
                        Expnumbers += ", ";
                    }
                    sqlmid += "ExperimentNumber between " + start[j] + "
and " + end[j];
                    Expnumbers = start[j].ToString() + ", " +
end[j].ToString();
                }

                //to ensure unique value of each sample timeframes
                if (!sqlite3.isExpnumbersExist(database, Expnumbers))
                {
                    //insert to table sampleranges
                    for (int j = 0; j < nooftimeframes; ++j)
                    {
                        sql = sql = "insert into SampleRanges values(" +
Samplerangeid + ", " + start[j] + ", " + end[j] + ");";
                        sqlite3.iExecuteNonQueryInsert(sql, database);
//sqlite3.dll wrapper function
                    }

                    ++Samplerangeid;
                    sqlhead = "insert into TopSample select CType,
count(*), " + Samplerangeid + " from Type  where ";

                    //insert to table topsample
                    sql = sqlhead + sqlmid + sqltail;
                    sqlite3.iExecuteNonQueryInsert(sql, database);

                    //further table insertion...
                    while (matchcount < noofExpToMatch)
                    {
                        //get random experiment no.
                        ExpNo_tomatch = random.Next(maxnum + 1, _maxexpno);

                        //insert into table sampledetects
                        sql = "insert into SampleDetects values (" +
Expnumbers + ", " + ExpNo_tomatch.ToString() + ");";
                        sqlite3.iExecuteNonQueryInsert(sql, database);

                        //insert into table topsamplematches
                        sql = "insert into TopSampleMatches select a.CType,
a.Samplerangeid, b.drawnumber from TopSample a join Data b  where
a.CType              in (b.Type1, b.Type2, b.Type3) and b.ExperimentNumber
= " + ExpNo_tomatch + " and         a.Samplerangeid = " + Samplerangeid +
";";

                        sqlite3.iExecuteNonQueryInsert(sql, database);

                        ++matchcount;
                    }

                    ++count; //increment count to reach samplesize
                }
            }
        }

</CODE>

If I were to do it this way, it would take me about  30 hours to collect
17000 samples. I believe if I could do it faster using SQLite random number
generator and query instead of C#. Hence, I'd really appreciate any
guidance and help you could give me in transforming the C# code into
SQLite  queries to achieve faster random sampling collections. Thank's very
much for your time and any help you could give me.
Cheers, Rick.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to