SENTIMENT ANALYSIS AND CLUSTERING OF ISP SERVICE USERS BASED ON SOCIAL MEDIA PLATFORM X IN INDONESIA USING K-MEANS METHOD
Abstract
This study analyzes user sentiment towards Internet Service Providers (ISPs) in Indonesia as expressed on the social media platform X (formerly Twitter). Tweets were collected by web scraping, and their sentiment was classified with a BERT-based NLP model. The resulting dataset consists of 6,000 negative tweets, which were converted into numerical representations with TF-IDF and grouped by content similarity using K-Means clustering. The analysis identified three main clusters: technical issues (network disruptions, high prices, slow connections), customer service and interactions, and communication and customer satisfaction. Inter-cluster distances were 0.460 (Cluster 1 and Cluster 2), 0.349 (Cluster 1 and Cluster 3), and 0.341 (Cluster 2 and Cluster 3). Intra-cluster variations were 0.140 (Cluster 1), 0.113 (Cluster 2), and 0.064 (Cluster 3). Managerial implications include the need to improve technical service quality, customer service responsiveness, and billing transparency. The study is limited by the small amount of data and by potential bias. Future research could use regression models to predict customer satisfaction from complaint patterns and user sentiment.
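The TF-IDF and K-Means steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code. The six toy tweets, the function names, the smoothed IDF variant, and the farthest-point initialisation are all assumptions made for illustration. The abstract does not say how inter-cluster distance and intra-cluster variation were computed, so the sketch assumes Euclidean distance between centroids and mean squared distance of members to their centroid.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(texts):
    """Turn non-empty raw texts into dense, L2-normalised TF-IDF vectors."""
    docs = [t.lower().split() for t in texts]
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Smoothed IDF: one common variant, assumed here for illustration.
        vec = [counts[t] * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
               for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec))
        vectors.append([v / norm for v in vec])
    return vectors


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(euclidean(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: euclidean(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def cluster_metrics(points, labels, centroids):
    """Centroid-to-centroid distances and mean squared distance per cluster."""
    inter = {(i, j): euclidean(centroids[i], centroids[j])
             for i, j in combinations(range(len(centroids)), 2)}
    intra = {}
    for j, c in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        intra[j] = (sum(euclidean(p, c) ** 2 for p in members) / len(members)
                    if members else 0.0)
    return inter, intra


# Hypothetical negative tweets, two per complaint theme.
tweets = [
    "network down again slow connection",
    "slow connection network disruption tonight",
    "customer service never replies",
    "customer service agent rude replies late",
    "bill unclear no information announced",
    "no information about bill increase",
]

vectors = tfidf_vectors(tweets)
labels, centroids = kmeans(vectors, k=3)
inter, intra = cluster_metrics(vectors, labels, centroids)

print("labels:", labels)
for (i, j), dist in inter.items():
    print(f"distance between cluster {i} and cluster {j}: {dist:.3f}")
for j, var in intra.items():
    print(f"variation within cluster {j}: {var:.3f}")
```

On real data, a library implementation such as scikit-learn's `TfidfVectorizer` and `KMeans` would normally replace these hand-written functions. The pure-Python version is used here only to keep each step visible.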
Copyright (c) 2024 Marcel Kurniawan, Hendra Achmadi
This work is licensed under a Creative Commons Attribution 4.0 International License.