Python advanced #002 shallow copy() and deepcopy() (2020)
1. Introduction to copy module
The copy module provides two functions, copy.copy() and copy.deepcopy(). They are easy to confuse, so it is crucial that we understand exactly how they differ.
1.1 copy.copy()
There are two lists a and b:
>>> a = [11,22]
>>> b = [33,44]
In Python, the two lists are stored at different addresses in memory. The variable names "a" and "b" are like labels attached to the lists: each name holds a reference to its list rather than the list itself.
Next, we look up the identity of each list with the built-in function id(). In CPython, this value is the object's memory address.
>>> id(a)
43440616
>>> id(b)
39008264
As you can see, the two lists have different addresses, which means they occupy separate spaces in memory.
Next, we create a list "c" out of those two lists "a" and "b":
>>> c = [a, b]
>>> id(c)
39007720
As you can see, "c" lives at a different address.
Now let's find the addresses of c[0] and c[1]:
>>> id(c[0])
43440616
>>> id(c[1])
39008264
As you can see, we get the same addresses as the "a" and "b" lists. The two elements of "c" are references to the original "a" and "b" lists, not new lists stored elsewhere in memory.
Let's see the following code:
>>> d = c
>>> id(d) == id(c)
True
We assign "c" to "d" and find that "d" and "c" point to the same address. Assignment does not copy anything; it only gives the same list a second name.
Let's look at another snippet:
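The id() comparisons above can also be written with the "is" operator, which tests object identity directly. The following is a small sketch using the same "a", "b", "c" and "d" as above; the names other, same_outer, same_first and equal_but_distinct are only illustrative.

```python
a = [11, 22]
b = [33, 44]
c = [a, b]
d = c                      # plain assignment: a second name for the same list
other = [11, 22]           # a separate list with the same contents as a

same_outer = d is c                 # d and c are one and the same list
same_first = c[0] is a              # c stores a reference to a, not a copy
equal_but_distinct = other is a     # equal contents, but a different object

print(same_outer, same_first, equal_but_distinct)  # True True False
print(other == a)                                  # True
```

Note that == compares contents, while "is" compares identity, which is what id(x) == id(y) checks.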
>>> import copy
>>> e = copy.copy(c)
>>> id(e) == id(c)
False
>>> id(e)
44184616
When we call copy.copy(c) and assign its return value to "e", we find that "e" points to a different address.
So calling copy.copy(c) creates a new list that is stored elsewhere in memory.
Next, let's check whether the elements inside the new list "e" are still references to the original "a" and "b":
>>> id(e[0]) == id(a)
True
>>> id(e[1]) == id(b)
True
As you can see, the elements inside list "e" are still references to the original "a" and "b". Let's confirm it by modifying "a": if the change shows up in "e", our conclusion is correct.
>>> a.append('aa')
>>> e
[[11, 22, 'aa'], [33, 44]]
As you can see, the change to "a" is visible through "e".
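To make the boundary of a shallow copy concrete, here is a short sketch that changes the outer list and an inner list separately. It reuses the names from this section; the appended [55, 66] is just an illustrative value.

```python
import copy

a = [11, 22]
b = [33, 44]
c = [a, b]
e = copy.copy(c)        # new outer list, same inner lists

e.append([55, 66])      # changes only e's outer list; c is unaffected
a.append('aa')          # changes an inner list that c and e both reference

print(c)  # [[11, 22, 'aa'], [33, 44]]
print(e)  # [[11, 22, 'aa'], [33, 44], [55, 66]]
```

Only the outer container is independent after a shallow copy; anything nested inside it is still shared.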
1.2 copy.deepcopy()
Now, let's start again from fresh "a", "b" and "c" lists (without the 'aa' appended above) and write the following code:
>>> f = copy.deepcopy(c)
>>> id(f) == id(c)
False
>>> id(f)
43490504
>>> f
[[11, 22], [33, 44]]
As you can see, "f" points to a different location in memory, as we would expect. Now let's see whether its elements point to the original "a" and "b".
>>> id(f[0]) == id(a)
False
>>> id(f[1]) == id(b)
False
>>> id(f[0])
44184136
>>> id(f[1])
44184936
It is clear from our test that the element lists in "f" live at different locations in memory. Hence copy.deepcopy(c) not only creates a copy of the outer list, it also creates copies of the elements inside it.
Let's test it by modifying "a".
>>> a.append("aa")
>>> a
[11, 22, 'aa']
>>> f
[[11, 22], [33, 44]]
As we expect, the change to "a" is not visible through "f".
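The two behaviours are easiest to compare side by side. The sketch below takes both kinds of copy from the same fresh "c" and then modifies "a"; the names shallow and deep are only illustrative.

```python
import copy

a = [11, 22]
b = [33, 44]
c = [a, b]

shallow = copy.copy(c)      # copies the outer list only
deep = copy.deepcopy(c)     # copies the outer list and the inner lists

a.append('aa')              # modify one of the original inner lists

print(shallow)  # [[11, 22, 'aa'], [33, 44]]
print(deep)     # [[11, 22], [33, 44]]
```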
1.3 copy immutable objects
In the previous sections, the object we copied was a list, which is mutable. What happens if we copy an immutable object such as a tuple?
>>> import copy
>>> a = [11,22]
>>> b = [33,44]
>>> c = (a, b)
>>> d = copy.copy(c)
>>> d
([11, 22], [33, 44])
>>> id(c) == id(d)
True
Our "c" is a tuple of two lists. When we call copy.copy() on "c", no copy is made at all: "d" and "c" refer to the same object, so id(c) == id(d) returns True. It follows that the elements of "d" are not copied either.
>>> id(d[0]) == id(a)
True
>>> id(d[1]) == id(b)
True
Now, let's use deepcopy():
>>> e = copy.deepcopy(c)
>>> e
([11, 22], [33, 44])
>>> id(e) == id(c)
False
>>> id(e[0]) == id(a)
False
>>> id(e[1]) == id(b)
False
deepcopy(), however, does make a copy of "c", together with copies of "a" and "b". It has to, because the tuple contains mutable lists.
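Whether deepcopy() builds a new tuple depends on what the tuple contains. The sketch below contrasts a tuple of immutable values with a tuple of lists; it reflects CPython's behaviour as far as I know, and the names t and u are only illustrative.

```python
import copy

t = (11, 22, 'abc')          # holds only immutable values
u = ([11, 22], [33, 44])     # holds mutable lists

print(copy.copy(t) is t)       # True: a tuple is never shallow-copied
print(copy.deepcopy(t) is t)   # True: nothing inside needed copying
print(copy.copy(u) is u)       # True: still the same tuple
print(copy.deepcopy(u) is u)   # False: the inner lists had to be copied
```

Returning the same object is safe in the first three cases because a tuple cannot be changed in place.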
2. Other operations that correspond to copy.copy()
In Python, there are other operations that do the same job as copy.copy().
2.1 list slicing
>>> l = [11,22,33,44]
>>> m = l[:]
>>> id(l)
44262824
>>> id(m)
44262312
>>> n = copy.copy(l)
>>> id(n)
44262888
We can use l[:] to make a shallow copy of the original list, just as copy.copy() does.
>>> a = [11,22]
>>> b = [33,44]
>>> c = [a,b]
>>> d = c[:]
>>> id(d) == id(c)
False
>>> id(d[0]) == id(a)
True
>>> id(d[1]) == id(b)
True
As you can see, [:] only makes a shallow copy.
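As a quick check, the sketch below runs the same identity test on several ways of shallow-copying a list. The list() constructor is an addition that is not covered in the text above, and the dictionary name candidates is only illustrative.

```python
import copy

a = [11, 22]
b = [33, 44]
c = [a, b]

# Each entry builds a new outer list from c.
candidates = {
    "slice": c[:],
    "list()": list(c),
    "copy.copy": copy.copy(c),
}

for name, new in candidates.items():
    # A new outer list (False), but the same inner lists (True, True).
    print(name, new is c, new[0] is a, new[1] is b)
```

Every line prints False True True, so all three forms are shallow copies.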
2.2 passing a mutable object to a function as an argument
>>> l = ['a', 'b', 'c']
>>> def f(l):
...     l.append('d')
...
>>> f(l)
>>> l
['a', 'b', 'c', 'd']
When you pass a mutable object such as a list as a function argument, only a reference to the original list is passed, so changes made to the list inside the function are also visible in the original list outside the function.
If you want the function to work on a shallow copy or a deep copy of the original list instead, use one of the following:
import copy

def f(l):
    l = copy.copy(l)       # shallow copy: new outer list, same elements
    l.append('d')

def f(l):
    l = copy.deepcopy(l)   # deep copy: nested objects are copied as well
    l.append('d')
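A function that copies its argument usually also needs to return the result, since the copy is only bound to a local name. Here is a minimal sketch of that pattern; the function name with_d and the variable names are hypothetical.

```python
import copy

def with_d(items):
    """Return a new list with 'd' appended; the caller's list is untouched."""
    items = copy.copy(items)   # rebind the local name to a shallow copy
    items.append('d')          # modifies the copy only
    return items

original = ['a', 'b', 'c']
result = with_d(original)

print(original)  # ['a', 'b', 'c']
print(result)    # ['a', 'b', 'c', 'd']
```

A shallow copy is enough here because only the outer list is modified; use copy.deepcopy() if the function also changes nested objects.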
2.3 The copy() method of list, dict and set
>>> l = [11,22,33,44]
>>> ll = l.copy()
>>> id(l) == id(ll)
False
>>> d = {1:[11,22], 2:[33,44]}
>>> dd = d.copy()
>>> id(d) == id(dd)
False
>>> d[1].append(55)
>>> d
{1: [11, 22, 55], 2: [33, 44]}
>>> dd
{1: [11, 22, 55], 2: [33, 44]}
>>> s = {11,22,33,44}
>>> type(s)
<class 'set'>
>>> ss = s.copy()
>>> id(s) == id(ss)
False
As you can tell, the copy() method only makes a shallow copy: the new container is a separate object, but any mutable objects inside it are still shared with the original.
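If the values inside a dict must be independent as well, copy.deepcopy() is the alternative to d.copy(). The sketch below compares the two on the same dictionary as above; the names shallow and deep and the extra key 3 are only illustrative.

```python
import copy

d = {1: [11, 22], 2: [33, 44]}
shallow = d.copy()          # new dict, same value lists
deep = copy.deepcopy(d)     # new dict with copied value lists

d[1].append(55)             # mutate a value shared with the shallow copy
d[3] = [66]                 # add a key to the original dict only

print(shallow)  # {1: [11, 22, 55], 2: [33, 44]}
print(deep)     # {1: [11, 22], 2: [33, 44]}
```

The new key 3 appears in neither copy, because both copies have their own outer dict.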