Genetic Diversity Study through K-means Clustering in Germplasm Accessions of Green gram [Vigna radiata (L.)] Under Drought Condition

Kanavi, M. S. P., Prakash Koler, Somu, G. and N. Marappa

  • Page No:  138 - 147
  • Published online: 16 Apr 2020
  • DOI : HTTPS://DOI.ORG/10.23910/IJBSM/2020.11.2.2078

  • Abstract
  •  kanavi.uasb@gmail.com

An experiment was conducted to evaluate 205 green gram germplasm accessions along with five check entries for drought tolerance using an augmented design during summer 2015 by imposing drought stress. Observations were recorded on 17 quantitative traits. ANOVA revealed highly significant differences among germplasm accessions for yield, yield component traits and drought tolerance traits. Mean squares attributable to ‘Genotypes vs check entries’ were significant for all the traits except seeds per pod and relative water content. Based on K-means clustering, all the 205 germplasm accessions were grouped into seven different clusters. Cluster V was the largest with 38 genotypes followed by cluster I with 36, clusters III and VII with 28 each, cluster II with 27, cluster IV with 25 and cluster VI with 23 genotypes. The distribution of genotypes from different geographical regions into the various clusters was random, indicating that genotypes originating from different agro-climatic regions grouped together, with no parallelism between genetic diversity and geographical distribution. The maximum inter cluster distance was recorded between the clusters I and VI (208.17) followed by cluster V and VI (168.52). The minimum inter cluster distance was recorded between the clusters IV and V (45.01) followed by cluster IV and VII (46.97). The maximum intra cluster distance was recorded for the cluster VI (208.17) followed by cluster IV (160.40). The minimum intra cluster distance was recorded for the cluster IV (45.01) followed by cluster V (52.55).

Keywords :   Green gram, drought tolerance, genetic diversity, k-means clustering

  • Introduction

    Green gram [Vigna radiata (L.) Wilczek], also known as mung bean, is an important short-duration pulse crop of the tropical and subtropical countries of the world. Green gram is the third most important pulse crop of India after chickpea and red gram. It belongs to the papilionoid subfamily of the family Fabaceae and has a diploid chromosome number of 2n=2x=22. The word “pulse” is derived from the Latin word “puls” meaning pottage, i.e. seeds boiled to make porridge or thick soup (Singh et al., 2018). The seeds of green gram are rich in minerals such as phosphorus and calcium and in vitamins, and also contain higher levels of folate and iron than most other legumes (Keatinge et al., 2011). The protein content of pulses (20-25%) is twice that of cereals and almost equal to that of meat and poultry; hence pulses are commonly called the poor man’s meat (Reddy, 2009). India is a major pulse-producing country, accounting for 30-35% and 27-28% of the total area and production, respectively. The average productivity of mung bean in India is nevertheless low compared with the world average. The main reason is that the crop is mostly grown as a fallow crop in the rabi or late rabi season, using the residual soil moisture available after harvest of the main kharif crop. Hence the crop is expected to experience several kinds of drought during its cropping period. Drought, caused by insufficient and erratic rainfall, is the major constraint on green gram production in India.

    Genetic diversity refers to the number of different alleles of all genes and the frequency with which they appear in the population. In green gram, the morphological characterization of accessions belonging to cultivated species reveals high genetic variability for a trait other than genetic diversity. Murthy and Arunachalam (1966) emphasized the importance of genetic diversity in the selection of parents for hybridization programmes in different crops. The genetic diversity present in germplasm accessions is an important tool for any plant breeding programme (Azam et al., 2018). Assessment of genetic variation gives a correct picture of its extent and helps to improve the response of genotypes to biotic and abiotic stresses (Panigrahi and Baisakh, 2014). Genetic variability offers a working bench for selection, whose intensity and direction are determined by crop breeders according to the breeding objectives for crop improvement in mungbean. Genetic diversity is one of the critical criteria for selecting parents in a hybridization programme in order to isolate the best genotypes from transgressive segregants. Cluster analysis in green gram helps plant breeders to identify genetically diverse parents falling in different clusters (Umesh et al., 2017).

    Clustering is a technique in which large numbers of data points are grouped together to form clusters. Cluster analysis groups, categorizes or classifies a set of objects into subsets, called clusters, in such a way that the items inside one subset are more “similar” to each other and “dissimilar” to items inside other subsets. There must therefore be a way to distinguish between “dissimilar” and “similar” items. K-means clustering is an important and basic clustering technique and one of the most widely used algorithms for clustering around a pre-specified number of cluster centres (means). Clustering can be used in an exploratory manner to discover meaningful groupings within a data set, or it can serve as the starting point for more advanced analysis (Wang et al., 2019). K-means clustering is an unsupervised machine learning algorithm. Its attractiveness lies in its efficiency, with a complexity of O(n*K*i*a), where n, K, i and a are the numbers of data points, clusters, iterations and attributes, respectively. Assessment of genetic diversity is a must for any plant breeding programme in order to identify genetically diverse parents for hybridization. K-means clustering is a powerful technique for assessing genetic diversity: it creates genetically diverse clusters / heterotic groups based on genetic distances between germplasm accessions. Once the heterotic groups are created, it is easy to identify clusters that are genetically very distant, and the germplasm accessions falling in these clusters are also genetically very diverse. It thus becomes easy for plant breeders to identify genetically diverse germplasm accessions, which in turn serve as parental lines in crossing programmes. This research was carried out to identify genetically diverse drought tolerant genotypes that can later be used as parental lines in plant breeding programmes to develop drought tolerant genotypes.
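
    As a rough illustration of the O(n*K*i*a) cost quoted above, the sketch below multiplies the four factors for a data set of the size used in this study. The function name is hypothetical, and the number of iterations (10) is an assumed figure, since the iteration count is not reported.

```python
def kmeans_cost(n_points: int, n_clusters: int, n_iterations: int, n_attributes: int) -> int:
    """Approximate count of coordinate-level operations, O(n * K * i * a)."""
    return n_points * n_clusters * n_iterations * n_attributes


# 205 accessions, 7 clusters and 17 traits come from this study;
# 10 iterations is an assumption made only for illustration.
cost = kmeans_cost(n_points=205, n_clusters=7, n_iterations=10, n_attributes=17)
print(cost)  # 243950
```

    Even with several times as many iterations the count stays in the low millions, which is why k-means remains practical for germplasm collections of this size.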


  • Materials and Methods

    The experiment was conducted at the experimental plot of the College of Agriculture, Hassan, University of Agricultural Sciences, Bengaluru, India. The experimental site is located in the Southern Transitional Zone (Zone-7) of Karnataka at an altitude of 827 m above Mean Sea Level (MSL), at 33' N latitude and 75° 33' to 76° 38' E longitude. The study material consisted of 205 germplasm accessions collected from different research institutions / organizations representing different agro-climatic zones. The list of germplasm accessions used in the study, with their sources, is given in Table 1.


    2.1.  Layout of the experiment

    The experiment was conducted in an Augmented Randomized Complete Block Design with 205 germplasm accessions. As per the augmented RCBD, the check entries were replicated twice randomly in each block. There were 5 blocks, each block had 5 plots of size 3×3 m², thus each block size was 15 m². The gross area of the experimental plot was 75 m². The row spacing was 30 cm and the inter-plant distance was 10 cm. The experiment was conducted during summer 2015. Recommended crop production practices were followed to raise a healthy crop.

    2.2.  Imposing drought condition

    Drought condition was imposed by withholding irrigation 25 days after sowing (Bangar et al., 2019). Since the experiment was conducted during the summer season, there were no unpredicted rains during the entire cropping period; hence the drought condition was effectively imposed. The rainfall data of the experimental site during the cropping period are given in Table 2.


    2.3.  Plant sampling and data collection

    Observations were recorded on five randomly chosen competitive plants from each germplasm accession for all the characters except days to 50% flowering and days to maturity, which were recorded on plot basis. The values of five competitive plants were averaged and expressed as mean of the respective characters. The observations were taken on the following traits: days to 50% flowering, days to maturity, plant height (cm), clusters plant⁻¹, pods cluster⁻¹, pods plant⁻¹, pod length (cm), seeds pod⁻¹, test weight, threshing %, harvest index (%), SCMR (SPAD chlorophyll meter reading), leaf water potential (MPa), proline content (μg g⁻¹), relative water content, specific leaf area and seed yield plant⁻¹.

    2.4.  Statistical analysis

    2.4.1.  Analysis of variance (ANOVA)

    The mean values of each quantitative trait over five randomly selected plants in each genotype and check entry were used for statistical analysis. ANOVA was performed to partition the total variation among genotypes and check entries into sources attributable to ‘Genotypes+Check entries’, ‘Genotypes’, ‘Check entries’ and ‘Genotypes vs check entries’, following the augmented design suggested by Federer (1956), using the statistical package for augmented designs in SAS version 9.3 and IndoStat. The adjusted trait mean of each genotype was estimated (Federer, 1956) and used for all subsequent statistical analyses.
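
    The adjustment of unreplicated entries in an augmented design can be sketched as follows. This is a minimal illustration, not the SAS or IndoStat procedure: it assumes each check appears once in every block, estimates a block effect as the average deviation of the checks in that block from their overall means, and subtracts it from the observed value of a test entry. The function names and the numbers are hypothetical.

```python
from statistics import mean


def block_effects(check_values):
    """Estimate block effects from check entries.

    check_values: {block: {check_name: value}}, with every check in every block.
    """
    check_names = list(next(iter(check_values.values())).keys())
    # Overall mean of each check across all blocks.
    check_means = {c: mean(check_values[b][c] for b in check_values) for c in check_names}
    # Block effect = average deviation of the checks in that block.
    return {
        b: mean(values[c] - check_means[c] for c in check_names)
        for b, values in check_values.items()
    }


def adjusted_value(observed, block, effects):
    """Remove the estimated block effect from an unreplicated test entry."""
    return observed - effects[block]


# Hypothetical seed yields (g per plant) of two checks in two blocks.
checks = {"B1": {"C1": 6.0, "C2": 8.0}, "B2": {"C1": 4.0, "C2": 6.0}}
effects = block_effects(checks)
print(effects)                             # {'B1': 1.0, 'B2': -1.0}
print(adjusted_value(5.5, "B1", effects))  # 4.5
```

    An accession grown in the more favourable block B1 is adjusted downwards, so that accessions from different blocks become comparable before clustering.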

    2.4.2.  K-means clustering

    The germplasm accessions were classified following the ‘k-means clustering’ model as explained by Macqueen (1967) and Forgy (1965). K-means cluster analysis was performed in SAS version 9.3 and NCSS statistical software. The trait means and variances were estimated in each cluster and tested for their homogeneity across the clusters using the ‘F’ test and ‘Levene’ test (Levene, 1960).

    The test statistic (W) for Levene’s test was computed as,
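
    The standard form of Levene's statistic is given here, on the assumption that it was applied unchanged with clusters as the groups. The symbols are introduced for this sketch: k is the number of clusters, N_i the number of accessions in cluster i, N the total number of accessions, Y_ij the trait value of accession j in cluster i, and Z_ij its absolute deviation from the cluster mean.

```latex
W = \frac{N-k}{k-1} \cdot
    \frac{\sum_{i=1}^{k} N_i \left(\bar{Z}_{i\cdot} - \bar{Z}_{\cdot\cdot}\right)^{2}}
         {\sum_{i=1}^{k} \sum_{j=1}^{N_i} \left(Z_{ij} - \bar{Z}_{i\cdot}\right)^{2}},
\qquad
Z_{ij} = \left| Y_{ij} - \bar{Y}_{i\cdot} \right|
```

    Under the hypothesis of equal variances, W is compared with the F distribution with k-1 and N-k degrees of freedom.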


  • Results and Discussion

    3.1.  Analysis of variance (ANOVA)

    Analysis of variance revealed highly significant mean squares attributable to germplasm accessions for all the traits (Table 3).


    Mean squares attributable to ‘Genotypes vs check entries’ were significant for all the traits except seeds per pod and relative water content. These results suggest significant differences among the germplasm accessions. The germplasm accessions as a group differed significantly for all of the traits under investigation; similarly, the check entries as a group differed significantly for most of the traits under study.

    3.2.  K-means clustering

    K-means clustering partitions n objects into k clusters such that each object belongs to the cluster with the nearest mean. K-means is a centroid-based clustering algorithm. ‘K’ represents the number of clusters and is an input parameter. Each element in the data set is assigned to the cluster centre at the smallest distance from it. This method produces exactly k different clusters of greatest possible distinction. The K-means cluster analysis is presented in Table 4.
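
    The assignment and update steps described above can be sketched in a few lines. This is a minimal illustration and not the SAS or NCSS implementation used in the study: it assumes Euclidean distance, takes the first k points as initial centres for reproducibility, and uses a small hypothetical two-trait data set.

```python
from math import dist


def kmeans(points, k, max_iter=100):
    """Plain k-means: assign each point to the nearest mean, then update the means."""
    centroids = [tuple(p) for p in points[:k]]  # simplification: first k points as seeds
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        if new_labels == labels:  # converged: no point changed cluster
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical accessions described by two standardized traits.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centroids = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

    With traits measured on very different scales, as in this study, the variables would normally be standardized first so that no single trait dominates the distance.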


    3.3.  Clustering pattern and composition of group

    The analysis revealed a wide range of variability for all the traits studied, indicating the presence of significant variation among the genotypes. Based on the K-means clustering analysis, all the 205 germplasm accessions including five check entries were grouped into seven different clusters, as presented in Table 5 and Figure 1.



    Cluster V was the largest with 38 genotypes followed by cluster I with 36, clusters III and VII with 28 each, cluster II with 27, cluster IV with 25 and cluster VI with 23 genotypes. The distribution of genotypes from different geographical regions into the various clusters was random, indicating that genotypes originating from different agro-climatic / geographical regions grouped together, with no parallelism between genetic diversity and geographical distribution. Our results are in agreement with the findings of Raje and Rao (2001), Venkateswarlu (2001), Dasgupta et al. (2005), Makeen et al. (2007), Tabasum et al. (2010), Divyaramakrishnan and Savithramma (2014), Suhel et al. (2015), John et al. (2015), Gunjeet et al. (2015), Muhammad et al. (2016), Wang et al. (2017), Kaur et al. (2018), Sharma et al. (2018) and Mohan et al. (2019). Sanhita et al. (2019) reported the formation of four clusters of mungbean genotypes for bruchid resistance when the data were subjected to multivariate analysis.

    3.4.  Intra and inter cluster distances between clusters

    The intra and inter cluster distances are presented in Table 6. The range of inter cluster distance was 45.01 to 208.17. The maximum inter cluster distance was recorded between the clusters I and VI (208.17) followed by cluster V and VI (168.52). The minimum inter cluster distance was recorded between the clusters IV and V (45.01) followed by cluster IV and VII (46.97). The range of intra cluster distance was 74.41 to 134.82. The maximum intra cluster distance was recorded for the cluster VI (208.17) followed by cluster IV (160.40). The minimum intra cluster distance was recorded for the cluster IV (45.01) followed by cluster V (52.55).
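
    The distance measure behind these figures is not stated. One common convention, assumed here only for illustration, takes the inter cluster distance as the Euclidean distance between two cluster centroids and the intra cluster distance as the mean distance of the members from their own centroid. The function names and data below are hypothetical.

```python
from math import dist


def centroid(members):
    """Trait-wise mean of the accessions in one cluster."""
    return tuple(sum(col) / len(members) for col in zip(*members))


def inter_cluster_distance(cluster_a, cluster_b):
    """Euclidean distance between the centroids of two clusters."""
    return dist(centroid(cluster_a), centroid(cluster_b))


def intra_cluster_distance(members):
    """Mean Euclidean distance of the members from their own centroid."""
    c = centroid(members)
    return sum(dist(p, c) for p in members) / len(members)


# Two hypothetical clusters described by two traits.
cluster_a = [(0.0, 0.0), (2.0, 0.0)]
cluster_b = [(1.0, 4.0), (1.0, 6.0)]
print(inter_cluster_distance(cluster_a, cluster_b))  # 5.0
print(intra_cluster_distance(cluster_a))             # 1.0
```

    Under this convention, pairs of clusters with a large inter cluster distance are the natural source of divergent parents, while a small intra cluster distance indicates a homogeneous group.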

    These results suggest that genotypes grouped in different clusters may be used as potential parental lines in hybridization programmes to develop desirable genotypes, since genetic diversity can be best exploited and the chances of obtaining superior transgressive segregants are higher.


    The cluster means of 17 characters are presented in Table 7.


    From the data we can conclude that considerable variation exists for all the traits studied. Results showed that genotypes in cluster V were early flowering (38.92 days) whereas genotypes in cluster VI were late flowering (47.00 days). The genotypes in cluster V were early maturing (66.24 days) whereas genotypes in cluster VI were late maturing (74.96 days). Cluster IV exhibited the highest mean for plant height (45.70 cm) whereas the cluster IV showed the lowest (25.62). Clusters plant⁻¹ was highest in cluster III (6.95) and lowest in cluster VI (1.01). Pods cluster⁻¹ was highest in cluster V (5.90) and lowest in cluster IV (2.37). Pods plant⁻¹ was highest in cluster I (25.22) and lowest in cluster IV (5.88). Pod length was highest in cluster V (6.74) and lowest in cluster VI (4.99). Seeds pod⁻¹ was highest in cluster VII and lowest in cluster IV (4.85). Test weight was highest in cluster IV (3.76) and lowest in cluster II (2.84). Threshing percentage was highest in cluster VI (63.82) and lowest in cluster IV (57.90). Harvest index was highest in cluster V (41.93) and lowest in cluster VI (28.72).

    SPAD chlorophyll meter readings were highest in cluster V (65.06) and lowest in cluster VI (44.82). Leaf water potential was highest in cluster VII (-3.86) and lowest in cluster VI (-7.16). Proline content was highest in cluster III (124.16) and lowest in cluster VI (77.24). Relative water content was highest in cluster V (89.23) and lowest in cluster VI (44.26). Specific leaf area was highest in cluster VII (219.37) and lowest in cluster I (90.93). Seed yield plant⁻¹ was highest in cluster V (7.38) and lowest in cluster IV (1.05). Three clusters, namely cluster IV, cluster V and cluster VI, had the maximum representation in terms of having either the highest or the lowest cluster means for the traits, thus forming diverse groups of genotypes. Umashankar and Sarkar (2018) reported similar findings in green gram for plant height, days to maturity, number of pods per plant, protein content and seed yield. Similar findings were also reported by Divyaramakrishnan and Savithramma (2014).
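
    The statement about clusters IV, V and VI can be checked with a simple tally of the extremes reported in the text. The sketch below records, for each trait, the cluster with the highest and the lowest mean as stated above and counts how often each cluster appears. Plant height is left out because the text does not name distinct clusters for its two extremes.

```python
from collections import Counter

# (cluster with highest mean, cluster with lowest mean) for each trait, as reported in the text.
extremes = {
    "days to 50% flowering": ("VI", "V"),
    "days to maturity": ("VI", "V"),
    "clusters per plant": ("III", "VI"),
    "pods per cluster": ("V", "IV"),
    "pods per plant": ("I", "IV"),
    "pod length": ("V", "VI"),
    "seeds per pod": ("VII", "IV"),
    "test weight": ("IV", "II"),
    "threshing percentage": ("VI", "IV"),
    "harvest index": ("V", "VI"),
    "SCMR": ("V", "VI"),
    "leaf water potential": ("VII", "VI"),
    "proline content": ("III", "VI"),
    "relative water content": ("V", "VI"),
    "specific leaf area": ("VII", "I"),
    "seed yield per plant": ("V", "IV"),
}

# Count every appearance of a cluster as either extreme.
counts = Counter(cluster for pair in extremes.values() for cluster in pair)
print(counts.most_common(3))  # [('VI', 10), ('V', 8), ('IV', 6)]
```

    Clusters VI, V and IV account for 24 of the 32 extremes tallied, which is consistent with their description as the most contrasting groups.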


  • Conclusion

    A wide range of variability existed for all the traits studied, indicating the presence of significant variation among the genotypes. Based on the K-means clustering analysis, all the 205 germplasm accessions including five check entries were grouped into seven different clusters. The distribution of genotypes from different geographical regions into the various clusters was random, indicating that genotypes originating from different agro-climatic / geographical regions grouped together, with no parallelism between genetic diversity and geographical distribution.


  • Acknowledgement

    Kanavi, M. S. P. thanks the Director of Research, University of Agricultural Sciences, Bangalore, for financial assistance to carry out the research work.


  • Reference

    Azam, M., Hossain, M., Alam, M., Rahman, K., Hossain, M., 2018. Genetic variability, heritability and correlation path analysis in mungbean (Vigna radiata (L.) Wilczek). Bangladesh Journal of Agricultural Research 43(3), 407−416.

    Bangar, P., Chaudhury, A., Tiwari, B., Kumar, S., Kumari, R., Kangila, Bhat, K.V., 2019. Morphophysiological and biochemical response of mung bean [Vigna radiata (L.) Wilczek] varieties at different developmental stages under drought stress. Turkish Journal of Biology 43, 58−69.

    Dasgupta, T., Mukherjee, K., Roychoudhury, B., Nath, D., 2005. Genetic diversity of horse gram germplasm. Legume Research 28(3), 166−171.

    Divyaramakrishnan, C.K., Savithramma, D.L., 2014. Tailoring genetic diversity of mungbean [Vigna radiata (L.) Wilczek] germplasm through principal component and cluster analysis for yield and yield related traits. International Journal of Agronomy and Agricultural Research (IJAAR) 5(2), 94−102.

    Federer, W.T., 1956. Augmented (or hoonuiaku) designs. The Hawaiian Planters’ Record. LV (2), 191-208.

    Forgy, E.W., 1965. Cluster analysis of multivariate data: efficiency vs. interpretability of classifications. Biometrics 21, 768−769.

    Ghosh, S., Roy, A., Kundagrami, S., 2019. Diversity analysis of mungbean [Vigna radiata (L.) Wilczek] genotypes for bruchid resistance. Indian Journal of Agricultural Research 53(3), 309−314.

    John Kingsly, N.B., Packiaraj, D., Pandiyan, M., Senthil, N., 2015. Tailoring genetic diversity of mungbean (Vigna radiata (L.) Wilczek) germplasm through cluster analysis for yield and yield related traits. Trends in Biosciences 8(12), 3239−3244.

    Kaur, G., Joshi, A., Jain, D., Choudhary, R., Vyas, D., 2015. Diversity analysis of green gram (Vigna radiata (L.) Wilczek) through morphological and molecular markers. Turkish Journal of Agriculture and Forestry 40, 229−240.

    Kaur, G., Joshi, A., Jain, D., 2018. Marker assisted evaluation of genetic diversity in mung bean (Vigna radiata (L.) Wilczek) genotypes. Brazilian Archives of Biology and Technology 61, e18160613.

    Keatinge, J.D.H., Easdown, W.J., Yang, R.Y., Chadha, M.L., Shanmugasundaram, S., 2011. Overcoming chronic malnutrition in a future warming world: the key importance of mung bean and vegetable soybean. Euphytica 180, 129−141.

    Kumar, V.U., Sarkar, K.K., Satish, G., Ramchander, 2017. Tailoring the genetic diversity of mung bean through cluster analysis for yield attributing traits to obtain efficient hybrids. Bulletin of Environment, Pharmacology and Life Sciences 6(3), 365−370.

    Levene, H., 1960. Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling. Stanford University Press, Palo Alto, 278−292.

    Macqueen, J.B., 1967. Some methods for classification and analysis of multivariate observations. Proceedings of 5th Berkeley Symposium. Mathematical statistics and probability. University of California Press 1, 281−297.

    Makeen, K., Abrahim, G., Jan, A., Singh, A.K., 2007. Genetic variability and correlation studies on yield and its components in mung bean (Vigna radiata (L.) Wilczek). Journal of Agronomy 6, 216−218.

    Mohan, S., Sheeba, A., Kalaimagal, T., 2019. Genetic Diversity and Association Studies in Greengram [Vigna radiata (L.) Wilczek]. Legume Research, DOI: 10.18805/LR-4176, 1-6.

    Muhammad, A., Muhammad, A.M., Qamar, U.Z., Muhammad, A.A., 2016. Uncovering the of mung bean (Vigna radiata L. Wilczek) genotypes under saline conditions using k-mean cluster analysis. Journal of Agriculture and Basic Sciences 1(1), 37−44.

    Murthy, B.R., Arunachalam, V., 1966. The nature of genetic divergence in relation to breeding system in some crop plants. Indian Journal of Genetics and Plant Breeding 26, 188−198.

    Panigrahi, K.K., Baisakh, B., 2014. Genetic diversity assessment for yield contributing characters of green gram [Vigna radiata (L.) Wilczek] cultivars from Odisha. Environment and Ecology 32(1A), 294−297.

    Raje, R.S., Rao, S.K., 2001. Genetic diversity in a germplasm collection of mung bean (Vigna radiata (L.) Wilczek). Indian Journal of Genetics and Plant Breeding 61, 50−52.

    Reddy, A.A., 2009. Pulses production technology: status and way forward. Economic and Political Weekly 44(52), 3−80.

    Sharma, S.R., Singh, D., Kumar, P., Khedar, O.P., Varshnay, N., 2018. Assessment of genetic diversity in mungbean [Vigna radiata (L.) Wilczek] genotypes. International Journal of Genetics 10(7), 471−474.

    Wang, S., Gittens, A., Mahoney, M.W., 2019. Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds. Journal of Machine Learning Research 20, 1−49.

    Singh, R., Singh, M.K., Singh, A.K., Singh, C., 2018. Pulses production in India: issues and elucidations. Pharma. Innovation 7(1), 10−13.

    Suhel, M., Singh, I.P., Bohra, A., Singh, C.M., 2015. Multivariate analysis in green gram [Vigna radiata (L.) Wilczek]. Legume Research 38(6), 758−762.

    Tabasum, A., Saleem, M., Aziz, I., 2010. Genetic variability, trait association and path analysis of yield and yield components in mung bean (Vigna radiata (L.) Wilczek). Pakistan Journal of Botany 42, 3915−3924.

    Venkateswarlu, O., 2001. Genetic variability in green gram (Vigna radiata (L.) Wilczek). Legume Research 24, 69−70.

    Wang, L., Bai, P., Yuan, X., Chen, H., Wang, S., Chen, X., Cheng, X., 2017. Genetic diversity assessment of a set of introduced mung bean accessions (Vigna radiata L.). The Crop Journal 6(2), 207−213.


Cite

1.
Kanavi , P MS, Koler P, Somu , G , Marappa N. Genetic Diversity Study through K-means Clustering in Germplasm Accessions of Green gram [Vigna radiata (L.)] Under Drought Condition IJBSM [Internet]. 16Apr.2020[cited 8Feb.2022];11(1):138-147. Available from: http://www.pphouse.org/ijbsm-article-details.php?article=1360
