$$Excel-Macros$$ Re: remove duplicates - is it a good code?

2011-08-02 Thread bpascal123
Hi,

This is another code with the same style as above and similar to it,
this time it highlights instead of delete duplicate values...

I am wondering why these two codes take seconds to execute on about
2000 rows when the Excel delete duplicates takes milliseconds... The
code for highlighting is not equivalent to the Excel function. I
understand highlighting duplicates values goes from top to bottom, it
highlights EVERY duplicates values in a column, it's not taking in
consideration duplicate rows... when the remove duplicate function
delete duplicate rows...

To sum up, I'm wondering why these code take quite some time? What is
the "Remove duplicate" function made with? I understand if I use it in
the code in the first post (I would need more knowledge of Excel
objects, methods, properties I don't have yet) it would avoid the use
of looping through the entire data but doing this with an array in
memory shouldn't take so long???


Option Explicit

Dim TableA() As Variant
Dim TableB() As Long

Sub HighliteDuplicateValues()

Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False

Dim wb As Workbook
Set wb = ThisWorkbook
Dim ws As Worksheet
Set ws = wb.Worksheets(1)

Dim wsLastRow As Long, wsLastCol As Long 'worksheet base 1 array
Dim bigStr As String

ReDim TableA(0, 0)
Dim TaLastR As Long, TaLastC As Long

ReDim TableB(0)
Dim TbLastR As Long, TbLastC As Long
TbLastR = 0

Dim i As Long, j As Long
Dim it As Long, jt As Long
Dim tmp As Long
Dim x As Variant

Dim markValue As String
markValue = "---%%%"

'Copy all values into TableA
'Add one more column in TableA to concatenate rows values

wsLastRow = ws.Range("A1").CurrentRegion.Rows.Count
wsLastCol = ws.Cells(1, Columns.Count).End(xlToLeft).Column

TaLastR = wsLastRow - 2
TaLastC = wsLastCol

ReDim TableA(TaLastR, TaLastC)

For i = 0 To TaLastR
For j = 0 To TaLastC
TableA(i, j) = ws.Cells(i + 2, j + 1).Value
Next j
For jt = 1 To TaLastC
bigStr = CStr(bigStr) & CStr(ws.Cells(i + 2, jt))
Next jt
TableA(i, j - 1) = CStr(bigStr)
bigStr = ""
Next i


'Count duplicate values from concatenation
'Set the  tableB dimensions from counting duplicates
'Copy row numbers of duplicate values into new table

ReDim Preserve TableA(TaLastR, TaLastC + 1)
ReDim TableB(TbLastR + 2)

For i = 0 To TaLastR - 1
If CStr(TableA(i, TaLastC + 1)) = "" Then
For j = 0 To TaLastR
If i <> j Then
If CStr(TableA(i, TaLastC)) = CStr(TableA(j, TaLastC))
Then
TableA(i, TaLastC + 1) = markValue
TableA(j, TaLastC + 1) = markValue
TableB(TbLastR) = i + 2
TableB(TbLastR + 1) = j + 2
TbLastR = TbLastR + 2
ReDim Preserve TableB(TbLastR + 1)
End If
End If
Next j
End If
Next i

ReDim Preserve TableB(TbLastR - 1)

'''Test 1 : Mark duplicates on worksheets for comparaison and counting

For i = 0 To TaLastR
If CStr(TableA(i, TaLastC + 1)) = markValue Then
ws.Cells(i + 2, 3) = TableA(i, TaLastC + 1)
End If
Next i


ws.Range("A1").Select

'Highlight values in worksheet based on row numbers in new table

'Function to sort
x = SortArray(TableB)

'Apply color on duplicate
For i = 0 To TbLastR - 1
tmp = TableB(i)
ws.Cells(tmp, 1).Interior.Color = 65535

Next i

Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True

End Sub


Function SortArray(ByRef Table() As Long) As Variant
Dim tmp As Long
Dim i As Long, j As Long

For i = 0 To UBound(Table) - 1
For j = i + 1 To UBound(Table) - 1
If Table(i) > Table(j) Then
tmp = Table(i)
Table(i) = Table(j)
Table(j) = tmp
End If
Next j
Next i

SortArray = Table()

End Function

-- 
--
Some important links for excel users:
1. Follow us on TWITTER for tips tricks and links : 
http://twitter.com/exceldailytip
2. Join our LinkedIN group @ http://www.linkedin.com/groups?gid=1871310
3. Excel tutorials at http://www.excel-macros.blogspot.com
4. Learn VBA Macros at http://www.quickvba.blogspot.com
5. Excel Tips and Tricks at http://exceldailytip.blogspot.com
 
To post to this group, send email to excel-macros@googlegroups.com

<><><><><><><><><><><><><><><><><><><><><><>
Like our page on facebook , Just follow below link
http://www.facebook.com/discussexcel


$$Excel-Macros$$ Re: remove duplicates - is it a good code?

2011-08-03 Thread Stuart Redmann


On 1 Aug., bpascal123 wrote:
> Hi all Experts,
>
> Below is the code I produced following my learning of vba for Excel
> from excelvbasql.com. I would like to know if there is anything to be
> done about this code. I'm looking to learn from anyone who can share
> his/her experience.
>
> Cheers,


The code shows multiple deficiencies. I'll mark them at the places
where they appear.


> Option Explicit
>
> Sub DeleteDupl2()


Why does your sub not take any parameters? I'd have expected that it
would at least need to know from which Excel.Range it should remove
the duplicates.


> Application.Calculation = xlCalculationManual
> Application.ScreenUpdating = False
>
> Dim wb As Workbook
> Dim ws As Worksheet
> Set wb = ThisWorkbook
> Set ws = wb.Worksheets("sheet1")


A sheet named "sheet1" may not exist. Your code does not deal with
this gracefully.


> Dim lastr As Long, lastc As Long
> Dim i As Integer, j As Integer
> Dim icopy As Integer


This variable declaration style looks a lot like C. It is quite
confusing. Why don't you declare the variables at the places where you
actually use them?


> Dim Table() As Variant 'Array for all values
> Dim TableOK() As Variant 'Array for unique values


The variable names are quite non-descriptive. Instead of "Table" you
should name them "UnfilteredData" or "OriginalValues" or
"ValuesWithDuplicates". The same goes for "TableOK".


> Dim iOkSize As Integer, jOkSize As Integer 'Rows and column size
> for TableOK
> Dim iOK As Integer, jOk As Integer
> Dim idD As String 'to concatenate all values in a rows
> Dim idj As Integer 'loop variable to concatenate idD string
> Dim deleteSt As String
> deleteSt = "---%%%"
>
> lastr = ws.Cells(Rows.Count, 1).End(xlUp).Row
> lastc = ws.Range("A1").End(xlToRight).Column
> ReDim Table(lastr - 2, lastc)


If you invoke this sub on an empty sheet, you'll try to ReDim Table to
(-1, 256), which of course does not work. Furthermore, lastc will be
256 if there is only data in first column of the sheet. Have a look at
Worksheet.UsedRange, this is probably what you want. But keep in mind
that DeleteDupl2 should work on a range that is passed as input
parameter. Then you should add Subs like DeleteDupl2ForSelection and
DeleteDupl2ForUsedRange that can be launched via "Execute Macro" and
launch DeleteDupl2 with the appropriate parameter.


> ReDim TableOK(0, 0)
>
> '''Copy data into Array Table
> '''Concatenate column values into an addito
> For i = 2 To lastr
> For j = 0 To lastc - 1
> Table(i - 2, j) = ws.Cells(i, j + 1)
> Next j
> For idj = 1 To lastc
> idD = CStr(idD) & CStr(ws.Cells(i, idj))
> Next idj
> Table(i - 2, j) = CStr(idD)
> idD = ""
> Next i
>
> '''mark duplicates but keep untouched the original value
> For i = 0 To lastr - 2
> j = lastc
> idD = Table(i, j)
> For j = i + 1 To lastr - 2
> If CStr(Table(j, lastc)) = CStr(idD) And
> Right(CStr(Table(j, lastc)), 6) <> "---%%%" Then
> Table(j, lastc) = CStr(Table(j, lastc)) & "---%%%"
> End If
> Next j
> Next i
>
> Range(ws.Cells(2, 1), ws.Cells(lastr, lastc)).ClearContents
>
> '''Count unique values in Table
> For i = 0 To lastr - 2
> If CStr(Right(Table(i, lastc), 6)) <> deleteSt Then
> iOkSize = iOkSize + 1
> End If
> Next i
>
> iOkSize = iOkSize - 1
> jOkSize = lastc
>
> ReDim TableOK(iOkSize, jOkSize)
>
> ''Copy unique values into TableOK"
> For i = 0 To lastr - 2
> If CStr(Right(Table(i, lastc), 6)) <> deleteSt Then
> For j = 0 To lastc
> TableOK(iOK, jOk) = Table(i, j)
> jOk = jOk + 1
> Next j
> iOK = iOK + 1
> jOk = 0
> End If
> Next i
>
> ''Copy unique values in worksheet
>
> For i = 0 To iOkSize
> For j = 0 To jOkSize - 1
> ws.Cells(i + 2, j + 1) = TableOK(i, j)
> Next j
> Next i
>
> Application.Calculation = xlCalculationAutomatic
> Application.ScreenUpdating = True
>
> End Sub

Your algorithm works OK, but as you have noted, it is quite slow.
There are sevaral ways to make it faster:
(A) Replace the loops that read and write the data into the arrays by
a read operation that reads more than one cell at a time. You'll have
to work with VARIANT arrays, though, as this is what Excel uses
internally.
(B) Don't use strings as key for your array. Strings are quite slow.
Besides you'll get into trouble if you don't use some delimiting
character: With your current implementation the cells  aa  |  bb
will receive the same key as the cells   a  |   abb, so that your
algorithm will detect a false duplicate. Note that if you compare the
contents of the cells as strings, you'll be much slower when the cell
actually contains a 

Re: $$Excel-Macros$$ Re: remove duplicates - is it a good code?

2011-08-02 Thread Jai
HOW CAN REMOVE THE DUPLICATE VALUE IN EXCEL SHEET WITH THIS CODING
















On Wed, Aug 3, 2011 at 5:14 AM, bpascal123 wrote:

> Hi,
>
> This is another code with the same style as above and similar to it,
> this time it highlights instead of delete duplicate values...
>
> I am wondering why these two codes take seconds to execute on about
> 2000 rows when the Excel delete duplicates takes milliseconds... The
> code for highlighting is not equivalent to the Excel function. I
> understand highlighting duplicates values goes from top to bottom, it
> highlights EVERY duplicates values in a column, it's not taking in
> consideration duplicate rows... when the remove duplicate function
> delete duplicate rows...
>
> To sum up, I'm wondering why these code take quite some time? What is
> the "Remove duplicate" function made with? I understand if I use it in
> the code in the first post (I would need more knowledge of Excel
> objects, methods, properties I don't have yet) it would avoid the use
> of looping through the entire data but doing this with an array in
> memory shouldn't take so long???
>
>
> Option Explicit
>
> Dim TableA() As Variant
> Dim TableB() As Long
>
> Sub HighliteDuplicateValues()
>
> Application.Calculation = xlCalculationManual
> Application.ScreenUpdating = False
>
> Dim wb As Workbook
> Set wb = ThisWorkbook
> Dim ws As Worksheet
> Set ws = wb.Worksheets(1)
>
> Dim wsLastRow As Long, wsLastCol As Long 'worksheet base 1 array
> Dim bigStr As String
>
> ReDim TableA(0, 0)
> Dim TaLastR As Long, TaLastC As Long
>
> ReDim TableB(0)
> Dim TbLastR As Long, TbLastC As Long
> TbLastR = 0
>
> Dim i As Long, j As Long
> Dim it As Long, jt As Long
> Dim tmp As Long
> Dim x As Variant
>
> Dim markValue As String
> markValue = "---%%%"
>
> 'Copy all values into TableA
> 'Add one more column in TableA to concatenate rows values
>
> wsLastRow = ws.Range("A1").CurrentRegion.Rows.Count
> wsLastCol = ws.Cells(1, Columns.Count).End(xlToLeft).Column
>
> TaLastR = wsLastRow - 2
> TaLastC = wsLastCol
>
> ReDim TableA(TaLastR, TaLastC)
>
> For i = 0 To TaLastR
>For j = 0 To TaLastC
>TableA(i, j) = ws.Cells(i + 2, j + 1).Value
>Next j
>For jt = 1 To TaLastC
>bigStr = CStr(bigStr) & CStr(ws.Cells(i + 2, jt))
>Next jt
>TableA(i, j - 1) = CStr(bigStr)
>bigStr = ""
> Next i
>
>
> 'Count duplicate values from concatenation
> 'Set the  tableB dimensions from counting duplicates
> 'Copy row numbers of duplicate values into new table
>
> ReDim Preserve TableA(TaLastR, TaLastC + 1)
> ReDim TableB(TbLastR + 2)
>
> For i = 0 To TaLastR - 1
>If CStr(TableA(i, TaLastC + 1)) = "" Then
>For j = 0 To TaLastR
>If i <> j Then
>If CStr(TableA(i, TaLastC)) = CStr(TableA(j, TaLastC))
> Then
>TableA(i, TaLastC + 1) = markValue
>TableA(j, TaLastC + 1) = markValue
>TableB(TbLastR) = i + 2
>TableB(TbLastR + 1) = j + 2
>TbLastR = TbLastR + 2
>ReDim Preserve TableB(TbLastR + 1)
>End If
>End If
>Next j
>End If
> Next i
>
> ReDim Preserve TableB(TbLastR - 1)
>
> '''Test 1 : Mark duplicates on worksheets for comparaison and counting
>
> For i = 0 To TaLastR
>If CStr(TableA(i, TaLastC + 1)) = markValue Then
>ws.Cells(i + 2, 3) = TableA(i, TaLastC + 1)
>End If
> Next i
>
>
> ws.Range("A1").Select
>
> 'Highlight values in worksheet based on row numbers in new table
>
> 'Function to sort
> x = SortArray(TableB)
>
> 'Apply color on duplicate
> For i = 0 To TbLastR - 1
>tmp = TableB(i)
>ws.Cells(tmp, 1).Interior.Color = 65535
>
> Next i
>
> Application.Calculation = xlCalculationAutomatic
> Application.ScreenUpdating = True
>
> End Sub
>
>
> Function SortArray(ByRef Table() As Long) As Variant
>Dim tmp As Long
>Dim i As Long, j As Long
>
>For i = 0 To UBound(Table) - 1
>For j = i + 1 To UBound(Table) - 1
>If Table(i) > Table(j) Then
>tmp = Table(i)
>Table(i) = Table(j)
>Table(j) = tmp
>End If
>Next j
>Next i
>
>SortArray = Table()
>
> End Function
>
> --
>
> --
> Some important links for excel users:
> 1. Follow us on TWITTER for tips tricks and links :
> http://twitter.com/exceldailytip
> 2. Join our LinkedIN group @ http://www.linkedin.com/groups?gid=1871310
> 3. Excel tutorials at http://www.excel-macros.blogspot.com
> 4. Learn VBA Macros at http://www.quickvba.blogspot.com
> 5. Excel Tips and Tricks at http://exceldailytip.blogspot.com
>
> To post to this group, send email to excel-macros@googlegroups.com
>
> <><><><><><><><><><><><><><><><><><><><><><>
> Like our page on facebook , Just follow below link
> http://www.facebook.com/discussexcel
>

-- 
---

Re: $$Excel-Macros$$ Re: remove duplicates - is it a good code?

2011-08-02 Thread NOORAIN ANSARI
Dear Jai,

Please use below code..

Sub remove_duplicates()
Dim s As Range
For Each s In Selection
s.RemoveDuplicates 1, xlYes
Next
End Sub


-- 
Thanks & regards,
Noorain Ansari
*http://noorain-ansari.blogspot.com/* 

On Wed, Aug 3, 2011 at 10:28 AM, Jai  wrote:

>
>
>
>
>
> HOW CAN REMOVE THE DUPLICATE VALUE IN EXCEL SHEET WITH THIS CODING
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Wed, Aug 3, 2011 at 5:14 AM, bpascal123 wrote:
>
>> Hi,
>>
>> This is another code with the same style as above and similar to it,
>> this time it highlights instead of delete duplicate values...
>>
>> I am wondering why these two codes take seconds to execute on about
>> 2000 rows when the Excel delete duplicates takes milliseconds... The
>> code for highlighting is not equivalent to the Excel function. I
>> understand highlighting duplicates values goes from top to bottom, it
>> highlights EVERY duplicates values in a column, it's not taking in
>> consideration duplicate rows... when the remove duplicate function
>> delete duplicate rows...
>>
>> To sum up, I'm wondering why these code take quite some time? What is
>> the "Remove duplicate" function made with? I understand if I use it in
>> the code in the first post (I would need more knowledge of Excel
>> objects, methods, properties I don't have yet) it would avoid the use
>> of looping through the entire data but doing this with an array in
>> memory shouldn't take so long???
>>
>>
>> Option Explicit
>>
>> Dim TableA() As Variant
>> Dim TableB() As Long
>>
>> Sub HighliteDuplicateValues()
>>
>> Application.Calculation = xlCalculationManual
>> Application.ScreenUpdating = False
>>
>> Dim wb As Workbook
>> Set wb = ThisWorkbook
>> Dim ws As Worksheet
>> Set ws = wb.Worksheets(1)
>>
>> Dim wsLastRow As Long, wsLastCol As Long 'worksheet base 1 array
>> Dim bigStr As String
>>
>> ReDim TableA(0, 0)
>> Dim TaLastR As Long, TaLastC As Long
>>
>> ReDim TableB(0)
>> Dim TbLastR As Long, TbLastC As Long
>> TbLastR = 0
>>
>> Dim i As Long, j As Long
>> Dim it As Long, jt As Long
>> Dim tmp As Long
>> Dim x As Variant
>>
>> Dim markValue As String
>> markValue = "---%%%"
>>
>> 'Copy all values into TableA
>> 'Add one more column in TableA to concatenate rows values
>>
>> wsLastRow = ws.Range("A1").CurrentRegion.Rows.Count
>> wsLastCol = ws.Cells(1, Columns.Count).End(xlToLeft).Column
>>
>> TaLastR = wsLastRow - 2
>> TaLastC = wsLastCol
>>
>> ReDim TableA(TaLastR, TaLastC)
>>
>> For i = 0 To TaLastR
>>For j = 0 To TaLastC
>>TableA(i, j) = ws.Cells(i + 2, j + 1).Value
>>Next j
>>For jt = 1 To TaLastC
>>bigStr = CStr(bigStr) & CStr(ws.Cells(i + 2, jt))
>>Next jt
>>TableA(i, j - 1) = CStr(bigStr)
>>bigStr = ""
>> Next i
>>
>>
>> 'Count duplicate values from concatenation
>> 'Set the  tableB dimensions from counting duplicates
>> 'Copy row numbers of duplicate values into new table
>>
>> ReDim Preserve TableA(TaLastR, TaLastC + 1)
>> ReDim TableB(TbLastR + 2)
>>
>> For i = 0 To TaLastR - 1
>>If CStr(TableA(i, TaLastC + 1)) = "" Then
>>For j = 0 To TaLastR
>>If i <> j Then
>>If CStr(TableA(i, TaLastC)) = CStr(TableA(j, TaLastC))
>> Then
>>TableA(i, TaLastC + 1) = markValue
>>TableA(j, TaLastC + 1) = markValue
>>TableB(TbLastR) = i + 2
>>TableB(TbLastR + 1) = j + 2
>>TbLastR = TbLastR + 2
>>ReDim Preserve TableB(TbLastR + 1)
>>End If
>>End If
>>Next j
>>End If
>> Next i
>>
>> ReDim Preserve TableB(TbLastR - 1)
>>
>> '''Test 1 : Mark duplicates on worksheets for comparaison and counting
>>
>> For i = 0 To TaLastR
>>If CStr(TableA(i, TaLastC + 1)) = markValue Then
>>ws.Cells(i + 2, 3) = TableA(i, TaLastC + 1)
>>End If
>> Next i
>>
>>
>> ws.Range("A1").Select
>>
>> 'Highlight values in worksheet based on row numbers in new table
>>
>> 'Function to sort
>> x = SortArray(TableB)
>>
>> 'Apply color on duplicate
>> For i = 0 To TbLastR - 1
>>tmp = TableB(i)
>>ws.Cells(tmp, 1).Interior.Color = 65535
>>
>> Next i
>>
>> Application.Calculation = xlCalculationAutomatic
>> Application.ScreenUpdating = True
>>
>> End Sub
>>
>>
>> Function SortArray(ByRef Table() As Long) As Variant
>>Dim tmp As Long
>>Dim i As Long, j As Long
>>
>>For i = 0 To UBound(Table) - 1
>>For j = i + 1 To UBound(Table) - 1
>>If Table(i) > Table(j) Then
>>tmp = Table(i)
>>Table(i) = Table(j)
>>Table(j) = tmp
>>End If
>>Next j
>>Next i
>>
>>SortArray = Table()
>>
>> End Function
>>
>> --
>>
>> --
>> Some important links for excel users:
>> 1. Follow us on TWITTER for tips tricks and links :
>> http://twitter.com/exceldailytip
>> 2. Join our LinkedIN g