Wednesday, 1 July 2015

Hash Map Initial Capacity and Load Factor


When we create an instance of HashMap, there are two parameters that can affect its performance:
  1. Initial Capacity
  2. Load Factor
Initial Capacity


Capacity means the number of buckets in the hash table, and initial capacity means the capacity at the time the hash table is created.

The default initial capacity in HashMap and LinkedHashMap is 16. TreeMap does not have an initial capacity, since it is a tree-based map rather than a hash table.

Load Factor

It represents the fill level at which the hash map's capacity should be doubled. The default value of the load factor is 0.75.

There is another property, "threshold", which is calculated using the formula below.

Formula:
                Threshold = Initial Capacity * load factor

Example
                Default value for Initial Capacity is 16
                Default value for Load Factor is 0.75

                Threshold = 16 * 0.75
                Threshold = 12

When the hash map fills up to its threshold value of 12, its capacity gets doubled.
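The doubling rule above can be sketched as a small simulation (plain Java mimicking the resize logic, not the actual HashMap internals):

```java
public class ThresholdDemo {

    public static void main(String[] args) {
        int capacity = 16;         // default initial capacity
        double loadFactor = 0.75;  // default load factor
        int threshold = (int) (capacity * loadFactor);
        System.out.println("threshold = " + threshold); // 16 * 0.75 = 12

        // Simulate inserting entries: once the size exceeds the threshold,
        // the capacity doubles and the threshold is recomputed.
        int size = 0;
        for (int i = 0; i < 20; i++) {
            size++;
            if (size > threshold) {
                capacity *= 2;
                threshold = (int) (capacity * loadFactor);
                System.out.println("resized: capacity=" + capacity
                        + ", threshold=" + threshold);
            }
        }
    }
}
```

Running this prints `threshold = 12`, and the simulated resize fires on the 13th insert, doubling the capacity to 32 with a new threshold of 24.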

NOTE: I tried this formula and debugged the code on Java 6, Java 7 and Java 8. The formula works fine for both Java 6 and Java 8.

But Java 7 does not rehash based on the threshold. I checked it twice and realized that it rehashes based on the initial capacity; that is, when the hash map fills up to its initial capacity value of 16, its capacity gets doubled.

Let’s take an example: 

import java.util.HashMap;
import java.util.Map;

public class HashMapDemo {

    public static void main(String[] args) {

        // Initial capacity 4, default load factor 0.75, so threshold = 3.
        // Resize points below are as observed while debugging on Java 8.
        Map<Integer, String> mapOne = new HashMap<Integer, String>(4);

        mapOne.put(1, "One");
        mapOne.put(2, "One");
        mapOne.put(3, "One");
        mapOne.put(4, "One");  // threshold (3) crossed: capacity doubles to 8, threshold becomes 6

        mapOne.put(5, "One");
        mapOne.put(6, "One");
        mapOne.put(7, "One");  // threshold (6) crossed: capacity doubles to 16, threshold becomes 12
        mapOne.put(8, "One");

        mapOne.put(9, "One");
        mapOne.put(10, "One");
        mapOne.put(11, "One");
        mapOne.put(12, "One");

        mapOne.put(13, "One"); // threshold (12) crossed: capacity doubles to 32, threshold becomes 24
        mapOne.put(14, "One");
        mapOne.put(15, "One");
        mapOne.put(16, "One");

        mapOne.put(17, "One");
        mapOne.put(18, "One");
        mapOne.put(19, "One");
        mapOne.put(20, "One");

        mapOne.put(21, "One");
        mapOne.put(22, "One");
        mapOne.put(23, "One");
        mapOne.put(24, "One");

        mapOne.put(25, "One"); // threshold (24) crossed: capacity doubles to 64, threshold becomes 48
        mapOne.put(26, "One");
        mapOne.put(27, "One");
        mapOne.put(28, "One");
    }
}
                    
In the above example, I have created a map with an initial capacity of 4 and the default load factor of 0.75.


Then,
                Threshold = initial capacity * 0.75
                Threshold = 4 * 0.75
                Threshold = 3

You can verify this from the debug information: when the map fills up to its threshold value, its capacity is doubled.
The Java API documentation states that the default load factor (0.75) offers a good tradeoff between time and space costs.

It also suggests minimizing the number of rehash operations: if the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operation will ever occur.

Suppose:

Number of entries in the HashMap = 100
Initial capacity = 16 (default value)
Load factor = 0.75 (default value)

By the rule above, 100 / 0.75 ≈ 133.3, which is far greater than the initial capacity of 16, so the map will definitely perform rehash operations as it grows.

But if we override the initial capacity to 200 (which is greater than 133.3), no rehash operation is required:
Map<Integer,String> mapOne = new HashMap<Integer,String>(200);
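This sizing rule can be wrapped in a small helper. Note that `capacityFor` below is a hypothetical method for illustration, not part of the JDK (Guava's `Maps.newHashMapWithExpectedSize` does something similar):

```java
import java.util.HashMap;
import java.util.Map;

public class CapacityHelper {

    // Hypothetical helper: the smallest initial capacity that avoids any
    // rehash for the expected number of entries, following the rule
    // initialCapacity > expectedEntries / loadFactor.
    static int capacityFor(int expectedEntries, double loadFactor) {
        return (int) Math.ceil(expectedEntries / loadFactor) + 1;
    }

    public static void main(String[] args) {
        int entries = 100;
        int capacity = capacityFor(entries, 0.75); // 135 for 100 entries

        // A map sized this way never rehashes while inserting 100 entries.
        Map<Integer, String> map = new HashMap<Integer, String>(capacity);
        for (int i = 1; i <= entries; i++) {
            map.put(i, "value-" + i);
        }
        System.out.println(map.size()); // 100
    }
}
```

Internally, HashMap rounds the requested capacity up to the next power of two, so passing 135 (or 200) actually yields a table of 256 with a threshold of 192, comfortably above 100 entries.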


 
